Archive
TSQL Interview Questions – Part 3
Next part of TSQL Interview Questions, contd. from my previous post.
51. What are Integrity Constraints?
https://sqlwithmanoj.com/2010/11/23/integrity-constraints/
52. Difference between:
– Views, Tables & Stored Procedures
– Stored Procedures & Functions
– Sub Query & Co-related sub-query
– Physical & Logical Schema
– Table variable & Temporary Tables – YouTube
– UNIQUE and CLUSTERED INDEXES
– Triggers and Constraints
– Primary Key & Unique Key
53. What do you mean by Referential Integrity? How will you attain it?
By using Foreign Keys.
http://www.databasedesign-resource.com/referential-integrity.html
54. What is the sequence for logical query processing, what is the order?
FROM, [JOIN CONDITION, JOIN TABLE …], ON, OUTER, WHERE, GROUP BY, CUBE/ROLLUP/GROUPING SETS, HAVING, SELECT, DISTINCT, ORDER BY, TOP
https://sqlwithmanoj.com/2010/10/28/sql-logical-query-processing-order/
55. How you debug Stored Procedures?
http://support.microsoft.com/kb/316549
http://www.sqlteam.com/article/debugging-stored-procedures-in-visual-studio-2005
http://www.15seconds.com/issue/050106.htm
56. What is ANSI_NULL?
https://sqlwithmanoj.com/2010/12/10/set-ansi_nulls-quoted_identifier-ansi_padding/
57. How will you rename a table?
By using sp_rename stored procedure.
58. What are ACID properties, define them?
https://sqlwithmanoj.com/2014/09/28/db-basics-what-are-acid-properties-of-a-transaction-in-an-rdbms/
59. What is a Live Lock?
http://social.msdn.microsoft.com/Forums/en/sqldatabaseengine/thread/478aa50f-b7dd-43fb-bb90-813057a6a1ed
http://blog.sqlauthority.com/2008/03/21/sql-server-introduction-to-live-lock-what-is-live-lock/
60. Difference between “Dirty Read” & “Phantom Read”. Explain both of them?
https://sqlwithmanoj.com/2011/07/20/dirty-reads-and-phantom-reads/
61. What is BITMAP index and BITMAP filtering?
http://msdn.microsoft.com/en-us/library/bb522541%28v=SQL.100%29.aspx
http://social.msdn.microsoft.com/Forums/hu-HU/sqldatabaseengine/thread/4717addd-1c8d-4c6b-8607-e191324c1cd8
62. What are different ISOLATION Levels (High to Low)?
http://msdn.microsoft.com/en-us/library/ms189122.aspx
– SERIALIZABLE
– SNAPSHOT
– REPEATABLE READ (phantom read)
– READ COMMITTED
– READ UNCOMMITTED (dirty read)
http://blogs.msdn.com/b/sqlcat/archive/2011/02/20/concurrency-series-basics-of-transaction-isolation-levels.aspx
63. What is the highest, lowest & default ISOLATION Level?
Highest: SERIALIZABLE
Lowest: READ UNCOMMITTED
Default: READ COMMITTED
64. What is SERIALIZABLE Isolation Level?
http://msdn.microsoft.com/en-us/library/ms173763.aspx
65. What is a CTE & how it is different from a Derived table? Example of recursive CTE.
https://sqlwithmanoj.com/2011/05/23/cte-recursion-sequence-dates-factorial-fibonacci-series/
66. What are design considerations for a Clustered Index & Non Clustered Index?
67. What are UDFs and their usage in a SELECT query?
– SCALAR
– TABLE VALUED
– MULTI LINE TABLE VALUED
https://sqlwithmanoj.com/2010/12/11/udf-user-defined-functions/
68. What are UNION, UNION ALL, EXCEPT & INTERSECTION keywords?
69. What are File Groups in SQL Server? What is its benefit? Explain any scenario where you will use multiple file groups.
http://msdn.microsoft.com/en-us/library/ms179316.aspx
70. How will you handle & avoid Deadlock?
http://www.sql-server-performance.com/tips/deadlocks_p1.aspx
http://support.microsoft.com/kb/169960
http://www.devx.com/getHelpOn/10MinuteSolution/16488/1954
71. What are Implicit Transactions?
http://msdn.microsoft.com/en-us/library/ms188317.aspx
http://msdn.microsoft.com/en-us/library/ms190230.aspx
72. How will you know Index usage on tables?
– Execution plan
– SET STATISTICS PROFILE ON
73. What are Indexed Views and their use? How will you create them?
http://msdn.microsoft.com/en-us/library/ms191432.aspx
A view must meet the following requirements before you can create a clustered index on it:
– The ANSI_NULLS and QUOTED_IDENTIFIER options must have been set to ON when the CREATE VIEW statement was executed.
– The ANSI_NULLS option must have been set to ON for the execution of all CREATE TABLE statements that create tables referenced by the view.
– The view must not reference any other views, only base tables.
– All base tables referenced by the view must be in the same database as the view and have the same owner as the view.
– The view must be created with the SCHEMABINDING option. Schema binding binds the view to the schema of the underlying base tables.
– User-defined functions referenced in the view must have been created with the SCHEMABINDING option.
– Tables and user-defined functions must be referenced by two-part names in the view. One-part, three-part, and four-part names are not allowed.
– If the view definition uses an aggregate function, the SELECT list must also include COUNT_BIG (*).
– If GROUP BY is specified, the view select list must contain a COUNT_BIG(*) expression, and the view definition cannot specify HAVING, ROLLUP, CUBE, or GROUPING SETS.
The SELECT statement in the view cannot contain the following Transact-SQL syntax elements:
– The * or table_name.* syntax to specify columns. Column names must be explicitly stated.
– An expression on a column used in the GROUP BY clause, or an expression on the results of an aggregate.
– A derived table.
– A common table expression (CTE).
– Rowset functions.
– UNION, EXCEPT or INTERSECT operators.
– Subqueries.
– Outer or self joins.
– TOP clause.
– ORDER BY clause.
– DISTINCT keyword.
– COUNT (COUNT_BIG(*) is allowed.)
– A SUM function that references a nullable expression.
– The OVER clause, which includes ranking or aggregate window functions.
– A CLR user-defined aggregate function.
– The full-text predicates CONTAINS or FREETEXT.
– COMPUTE or COMPUTE BY clause.
– The CROSS APPLY or OUTER APPLY operators.
– The PIVOT or UNPIVOT operators
– Table hints (applies to compatibility level of 90 or higher only).
– Join hints.
– Direct references to Xquery expressions. Indirect references, such as Xquery expressions inside a schema-bound user-defined function, are acceptable.
74. What do you mean by Concurrency control?
http://msdn.microsoft.com/en-us/library/ms189130.aspx
75. What do you understand by Star & Snowflake schema and whats the difference between them?
… more questions on next post Part-4.
Clustered Indexes, Non Clustered Indexes & why?
Creating Indexes on tables reduces the query retrieval time and increase the efficiency of SQL queries or statements fired against a database in SQL Server. Indexes are just like a Table of Contents in front side of the book or Index section at the back side of the book.
There are mainly 2 types of Indexes, CLUSTERED & NON-CLUSTERED index which can be created on a table.
– Clustered indexes are similar to a telephone directory where you search a person’s name alphabetically and get his phone number there only.
– Non Clustered indexes are similar to the Index of a book where you get the page number of the item you were searching for. Then turn to that page and read what you were looking for.
According to MS BOL one can create only one Clustered index & as many 249 Non Clustered indexes on a single table.
But why there is a need to create these indexes, what causes the fast retrival of data from the tables.
Let’s check this by creating a large table and creating these Indexes one by one and checking as we go one:
USE [AdventureWorks] GO select * from Sales.SalesOrderDetail -- Total 121317 records select * from Production.Product -- Total 504 records SELECT s.SalesOrderDetailID, s.SalesOrderID, s.ProductID, p.Name as ProductName, s.ModifiedDate INTO IndexTestTable FROM Sales.SalesOrderDetail s JOIN Production.Product p on p.ProductID = s.ProductID GO -- Test the table without any Indexes which is also a HEAP SELECT TOP 10 * FROM IndexTestTable --//////////////////////////////////////////////// --// Scenario 1 : When there is no Clustered Index --//////////////////////////////////////////////// SET STATISTICS PROFILE ON SET STATISTICS TIME ON SELECT SalesOrderDetailID FROM IndexTestTable WHERE SalesOrderDetailID = 60000 SET STATISTICS TIME OFF SET STATISTICS PROFILE OFF GO
CPU time = 15 ms, elapsed time = 17 ms. SELECT [SalesOrderDetailID] FROM [IndexTestTable] WHERE [SalesOrderDetailID]=@1 |--Table Scan(OBJECT:([AdventureWorks].[dbo].[IndexTestTable]), WHERE:([AdventureWorks].[dbo].[IndexTestTable].[SalesOrderDetailID]=[@1]))
It does a Table SCAN and takes 15ms of CPU time and elapsed time of 17ms. As the table is a HEAP so it will always SCAN the for matching rows in the entire table. To do a SEEK there must be an ordering of rows which can be done by putting a Primary key column which automatically creates a CLUSTERED INDEX OR creating a CLUSTERED INDEX explicitly shown below would do the ordering.
--// Create Clustered Index on SalesOrderDetailID column CREATE UNIQUE CLUSTERED INDEX IDX_UCI_SalesOrderDetailID ON IndexTestTable (SalesOrderDetailID) GO --// Check with Clustered Index SET STATISTICS PROFILE ON SET STATISTICS TIME ON SELECT SalesOrderDetailID FROM IndexTestTable WHERE SalesOrderDetailID = 60000 SET STATISTICS TIME OFF SET STATISTICS PROFILE OFF GO
CPU time = 0 ms, elapsed time = 1 ms. SELECT [SalesOrderDetailID] FROM [IndexTestTable] WHERE [SalesOrderDetailID]=@1 |--Clustered Index Seek(OBJECT:([AdventureWorks].[dbo].[IndexTestTable].[IDX_UCI_SalesOrderDetailID]), SEEK:([AdventureWorks].[dbo].[IndexTestTable].[SalesOrderDetailID]=[@1]) ORDERED FORWARD)
Now as shown above this does a Clustered Index SEEK after the creation of CLUSTERED Index. After creating an Index the CPU time is reduced to 0ms from 15ms and Elapsed time to 1ms from 17ms. The following queries will also do the SEEK operation:
SELECT SalesOrderDetailID FROM IndexTestTable WHERE SalesOrderDetailID > 60000 SELECT SalesOrderDetailID FROM IndexTestTable WHERE SalesOrderDetailID > 60000 AND SalesOrderDetailID < 70000 SELECT SalesOrderDetailID FROM IndexTestTable WHERE SalesOrderDetailID BETWEEN 60000 AND 70000 --//////////////////////////////////////////////////// --// Scenario 2 : When there is no Non-Clustered Index --//////////////////////////////////////////////////// SET STATISTICS PROFILE ON SET STATISTICS TIME ON SELECT SalesOrderDetailID FROM IndexTestTable WHERE SalesOrderID = 65000 AND ProductID = 711 SET STATISTICS TIME OFF SET STATISTICS PROFILE OFF GO
CPU time = 16 ms, elapsed time = 15 ms. SELECT [SalesOrderDetailID] FROM [IndexTestTable] WHERE [SalesOrderID]=@1 AND [ProductID]>@2 |--Clustered Index Scan(OBJECT:([AdventureWorks].[dbo].[IndexTestTable].[IDX_UCI_SalesOrderDetailID]), WHERE:([AdventureWorks].[dbo].[IndexTestTable].[SalesOrderID]=[@1] AND [AdventureWorks].[dbo].[IndexTestTable].[ProductID]>CONVERT_IMPLICIT(int,[@2],0)))
Now, for every situation or query the query optimizer will not do SEEK. In first query the INDEX was created on SalesOrderDetailID column, so it will not do a SEEK if query is applied on other columns. You would need to create a another INDEX for those columns. But you can create only one CLUSTERED INDEX. But yes you can also create as many as 249 NONCLUSTERED INDEXES on a table, as shown below.
--// Create Non-Clustered Index on SalesOrderID & ProductID columns CREATE NONCLUSTERED INDEX IDX_NCI_SalesOrderID_ProductID ON IndexTestTable (SalesOrderID, ProductID) GO --// Check with Non Clustered Index SET STATISTICS PROFILE ON SET STATISTICS TIME ON SELECT SalesOrderDetailID FROM IndexTestTable WHERE SalesOrderID = 65000 AND ProductID = 711 SET STATISTICS TIME OFF SET STATISTICS PROFILE OFF GO
CPU time = 0 ms, elapsed time = 1 ms. SELECT [SalesOrderDetailID] FROM [IndexTestTable] WHERE [SalesOrderID]=@1 AND [ProductID]>@2 |--Index Seek(OBJECT:([AdventureWorks].[dbo].[IndexTestTable].[IDX_NCI_SalesOrderID_ProductID]), SEEK:([AdventureWorks].[dbo].[IndexTestTable].[SalesOrderID]=[@1] AND [AdventureWorks].[dbo].[IndexTestTable].[ProductID] > CONVERT_IMPLICIT(int,[@2],0)) ORDERED FORWARD)
It does an Index SEEK, not Clustered Index SEEK. So this time it uses the NONCLUSTERED Index to SEEK the matching rows.
--////////////////////////////////////////////////////////////////////////// --// Scenario 3 : When there is no Non-Clustered Index with Included Columns --////////////////////////////////////////////////////////////////////////// SET STATISTICS PROFILE ON SET STATISTICS TIME ON SELECT ProductName FROM IndexTestTable WHERE SalesOrderID = 65000 AND ProductID = 711 SET STATISTICS TIME OFF SET STATISTICS PROFILE OFF GO
CPU time = 0 ms, elapsed time = 1 ms. SELECT [ProductName] FROM [IndexTestTable] WHERE [SalesOrderID]=@1 AND [ProductID]=@2 |--Nested Loops(Inner Join, OUTER REFERENCES:([AdventureWorks].[dbo].[IndexTestTable].[SalesOrderDetailID])) |--Index Seek(OBJECT:([AdventureWorks].[dbo].[IndexTestTable].[IDX_NCI_SalesOrderID_ProductID]), SEEK:([AdventureWorks].[dbo].[IndexTestTable].[SalesOrderID]=(65000) AND [AdventureWorks].[dbo].[IndexTestTable].[ProductID]=(711)) ORDERED FORWARD) |--Clustered Index Seek(OBJECT:([AdventureWorks].[dbo].[IndexTestTable].[IDX_UCI_SalesOrderDetailID]), SEEK:([AdventureWorks].[dbo].[IndexTestTable].[SalesOrderDetailID]= [AdventureWorks].[dbo].[IndexTestTable].[SalesOrderDetailID]) LOOKUP ORDERED FORWARD)
It still does an Index SEEK, no problem. Please note that it uses both the Indexes CLUSTERED & NONCLUSTERED.
But when you change the WHERE clause and increase the range of selected items then it does a Index SCAN.
SET STATISTICS PROFILE ON SET STATISTICS TIME ON SELECT ProductName FROM IndexTestTable WHERE SalesOrderID > 57916 AND ProductID > 900 SET STATISTICS TIME OFF SET STATISTICS PROFILE OFF GO
CPU time = 32 ms, elapsed time = 709 ms. SELECT [ProductName] FROM [IndexTestTable] WHERE [SalesOrderID]>@1 AND [ProductID]>@2 |--Clustered Index Scan(OBJECT:([AdventureWorks].[dbo].[IndexTestTable].[IDX_UCI_SalesOrderDetailID]), WHERE:([AdventureWorks].[dbo].[IndexTestTable].[SalesOrderID]>(57916) AND [AdventureWorks].[dbo].[IndexTestTable].[ProductID]>(900)))
--// Create NONCLUSTERED Covering Index on SalesOrderID, ProductID with Included Columns on SalesOrderDetailID CREATE NONCLUSTERED INDEX IDX_NCCI_IndexTestTable_SalesOrderID_ProductID ON IndexTestTable (SalesOrderID, ProductID) INCLUDE (ProductName) GO SET STATISTICS PROFILE ON SET STATISTICS TIME ON SELECT ProductName FROM IndexTestTable WHERE SalesOrderID > 57916 AND ProductID > 900 SET STATISTICS TIME OFF SET STATISTICS PROFILE OFF GO
CPU time = 0 ms, elapsed time = 654 ms. SELECT [ProductName] FROM [IndexTestTable] WHERE [SalesOrderID]>@1 AND [ProductID]>@2 |--Index Seek(OBJECT:([AdventureWorks].[dbo].[IndexTestTable].[IDX_NCCI_IndexTestTable_SalesOrderID_ProductID]), SEEK:([AdventureWorks].[dbo].[IndexTestTable].[SalesOrderID] > (57916)), WHERE:([AdventureWorks].[dbo].[IndexTestTable].[ProductID]>(900)) ORDERED FORWARD)
Now, again the query optimizer uses Index SEEK, but it uses the new NONCLUSTERED INDEX with INCLUDED column.
-- Final Cleanup DROP TABLE IndexTestTable
Related MS BOL links: http://msdn.microsoft.com/en-us/library/aa933131%28v=sql.80%29.aspx
BOL links for CLUSTERED Index vs NONCLUSTERED Index:-
http://www.devtoolshed.com/content/clustered-index-vs-non-clustered-index-sql-server
http://www.mssqlcity.com/FAQ/General/clustered_vs_nonclustered_indexes.htm
http://forums.devx.com/showthread.php?t=19018
Avoid CURSORS? why not use them optimally…
CURSORS or WHILE loops with temp-tables & counter, what do you prefer, personally and perofrmance wise?
This has been discussed in lots of forums, threads, posts and blogs previously. Many people and experts claim to use the either one and most of them are inclined to WHILE loops, and suggest to avoid CURSORS without any proof and logic.
Links: http://blog.sqlauthority.com/2008/05/21/sql-server-2005-twelve-tips-for-optimizing-sql-server-2005-query-performance/
http://www.c-sharpcorner.com/UploadFile/skumaar_mca/CursorsAlternative09012009011823AM/CursorsAlternative.aspx
http://www.sqlbook.com/SQL/Avoiding-using-SQL-Cursors-20.aspx
… and many more.
To make this more clear I tested this scenario myself and also posted this case in MSDN TSQL forum.
use [tempdb] GO create table T1 (sn int identity(1,1) primary key, data varchar(1000)) GO insert into T1 (data) values (replicate('a',1000)) GO 10000 select * from T1 create table T2 (sn int primary key, data varchar(1000)) create table T3 (sn int primary key, data varchar(1000)) -- Test the CURSOR, pull record from T1 and insert into T2 set nocount on declare @stDate datetime set @stDate = getdate() declare @sn int , @data varchar(1000) declare cr cursor FORWARD_ONLY FAST_FORWARD READ_ONLY for select sn, data from T1 open cr fetch next from cr into @sn, @data while @@fetch_status=0 begin insert into T2 select @sn, @data fetch next from cr into @sn, @data end close cr deallocate cr select Datediff(ms,@stDate,getdate()) -- Ran 4 times, it gives me 966, 413, 310, 306 ms GO -- Test the WHILE loop with counter (NO CURSOR), pull record from T1 and insert into T3 set nocount on declare @stDate datetime set @stDate = getdate() declare @ctr int set @ctr=0 while @ctr<=10000 begin insert into T3 select sn, data from T1 where sn = @ctr set @ctr = @ctr + 1 end select Datediff(ms,@stDate,getdate()) -- Ran 4 times, it gives me: 1070, 450, 503, 423 ms GO --Final Cleanup drop table T2 drop table T3 drop table T1
I ran the above code for CURSOR & WHILE loop 4 times and it gave me less execution time for CURSOR, everytime.
While using CURSORS the main thing most people miss out are the options that are available to optimize CURSOR use.
According to MS BOL they are:
DECLARE cursor_name CURSOR [ LOCAL | GLOBAL ] [ FORWARD_ONLY | SCROLL ] [ STATIC | KEYSET | DYNAMIC | FAST_FORWARD ] [ READ_ONLY | SCROLL_LOCKS | OPTIMISTIC ]
To know more check this link: http://msdn.microsoft.com/en-us/library/ms180169.aspx
This prove that CURSORs are more performant than WHILE loops. I’m open for your comments, thanks.
MSDN links:
http://social.msdn.microsoft.com/Forums/en-US/transactsql/thread/e67105a6-0f4a-4a12-85b9-e7e9855279e7/
http://social.msdn.microsoft.com/Forums/en-US/transactsql/thread/2978d387-fcd0-45bb-bf69-80139b6dac53
http://social.msdn.microsoft.com/Forums/en-US/transactsql/thread/ba15132e-c26d-4472-92e9-845ce2cc244d/
Excel data validation with VBA macros
You have an important lead set or any other orders/financial recordset in excel and have to import the data in SQL Server. This is not a one time effort but ongoing and in future you have to deals with lot such records/data files. To import valid data in your SQL tables you want to ensure and validate at the frst step instead of correcting the data by firing SQL queries. Also to avoid this manual work to identify the incorrect/invalid data out of thousands of records automated approach would be quite helpful.
Excel not only store data but also have a power to run VBA macro code against and manipulate the data. Lets take a simple example to validate a set of Customer numbers & their insert date, shown below:
-- Test Data which is correct and expected to import in SQL table.
InsertDate CustomerNo
11/16/2009 91878552
11/16/2009 101899768
11/16/2009 101768884
11/16/2009 123456789123
11/16/2009 101768800
Columns InsertDate should be a valid date & CustomerNo should be numeric & less than 12 characters or max length equal to 12.
-- Lets tweak the test data put some invalid values under both the columns, shown below:
InsertDate CustomerNo
11/16/2009 91878552
51/96/2009 101899768
11/16/2009 1017abc84
11/16/2009 123456789123
11/16/2009 101768800
If you are on MS Office 2007 create a Macro enabled excel file with extension *.xlsm, OR if you are on 2003 and lower version *.xls file would be fine. Copy the above data in Sheet1 with incorrect values. Now press open up the VBA macro editor or press ALT+F11 keys. Now double click on Sheet1 under “Project -VBAProjects” explorer.
I wish to run the validation while someone tries to save the macro enabled excel file. So the event selected is “BeforeSave”. And the VBA macro code goes below:
Private Sub Workbook_BeforeSave(ByVal SaveAsUI As Boolean, Cancel As Boolean) Dim errorCode As Integer Dim errorMsg As String Dim numOfRecs As Integer Dim counter As Integer '--------------------- 'Start - Validate Date '--------------------- Sheet1.Activate Range("A2").Select errorCode = 0 numOfRecs = 0 Do Until ActiveCell.Value = vbNullString ' Validate Date If IsDate(ActiveCell.Value) <> True Then errorCode = 1 ActiveCell.Interior.ColorIndex = 3 'Red Else ActiveCell.Interior.ColorIndex = 2 'White End If numOfRecs = numOfRecs + 1 ActiveCell.Offset(1, 0).Select Loop If errorCode = 1 Then errorMsg = errorMsg + vbCrLf & "- Invalid Insert Date" End If '------------------- 'End - Validate Date '------------------- '-------------------------- 'Start - Validate Customer# '-------------------------- Sheet1.Activate Range("B2").Select errorCode = 0 Do Until ActiveCell.Value = vbNullString ' Check for a valid Customer Number If IsNumeric(ActiveCell.Value) <> True Or Len(ActiveCell.Value) > 12 Then errorCode = 1 ActiveCell.Interior.ColorIndex = 3 'Red Else ActiveCell.Interior.ColorIndex = 2 'White End If ActiveCell.Offset(1, 0).Select Loop If errorCode = 1 Then errorMsg = errorMsg + vbCrLf & "- Invalid Customer #" End If '------------------------ 'End - Validate Customer# '------------------------ ' Go to first cell. Range("A1").Select 'If errors then display the error msg and do not save the file otherwise save the excel file. If errorCode = 1 Then MsgBox "Workbook not saved. Following are the fields that contain invalid values." _ & vbCrLf & errorMsg & vbCrLf & vbCrLf & _ "Please correct the values highlighted in RED color.", vbCritical, "Data Validation ERROR" Cancel = True End If End Sub
Now save the macro code and close the editor. Now try to Save the Excel file, you will get the cells with invalid values highlighted in RED, as shown in the image. The excel file not be saved and when closing you will get the dialog box to save it again & again.
In order to save it correctly you need to correct the invalid values, i.e. InsertDate & CustomerNo, and then save it. The RED highlighted cells will become WHITE for the correct values.
This excel file is now very much validated and clean to import in SQL Server or any other database.
Try this code and tweak it as per your needs.
I’ve also discussed this logic in MSDN’s TSQL forum at following link: http://social.msdn.microsoft.com/Forums/en-US/transactsql/thread/09299d7d-9306-4ea4-bb29-87207572aa04
Suggestions & comments are welcome!!!