T SQL | SQL with Manoj

Avoid CURSORS? why not use them optimally…

February 7, 2011 5 comments

CURSORS or WHILE loops with temp-tables & counter, what do you prefer, personally and perofrmance wise?

This has been discussed in lots of forums, threads, posts and blogs previously. Many people and experts claim to use the either one and most of them are inclined to WHILE loops, and suggest to avoid CURSORS without any proof and logic.
Links: http://blog.sqlauthority.com/2008/05/21/sql-server-2005-twelve-tips-for-optimizing-sql-server-2005-query-performance/
http://www.c-sharpcorner.com/UploadFile/skumaar_mca/CursorsAlternative09012009011823AM/CursorsAlternative.aspx
http://www.sqlbook.com/SQL/Avoiding-using-SQL-Cursors-20.aspx
… and many more.

To make this more clear I tested this scenario myself and also posted this case in MSDN TSQL forum.

use [tempdb]
GO

create table T1 (sn int identity(1,1) primary key, data varchar(1000))
GO

insert into T1 (data)
values (replicate('a',1000))
GO 10000

select * from T1

create table T2 (sn int primary key, data varchar(1000))
create table T3 (sn int primary key, data varchar(1000))

-- Test the CURSOR, pull record from T1 and insert into T2
set nocount on

declare @stDate datetime
set @stDate = getdate()
declare @sn int , @data varchar(1000)

declare cr cursor FORWARD_ONLY FAST_FORWARD READ_ONLY
for select sn, data from T1

open cr
fetch next from cr into @sn, @data
while @@fetch_status=0
begin
	insert into T2
	select @sn, @data
	fetch next from cr into @sn, @data
end
close cr
deallocate cr

select Datediff(ms,@stDate,getdate()) -- Ran 4 times, it gives me 966, 413, 310, 306 ms
GO

-- Test the WHILE loop with counter (NO CURSOR), pull record from T1 and insert into T3
set nocount on

declare @stDate datetime
set @stDate = getdate()
declare @ctr int
set @ctr=0

while @ctr<=10000
begin
	insert into T3
	select sn, data
	from T1
	where sn = @ctr

	set @ctr = @ctr + 1
end

select Datediff(ms,@stDate,getdate()) -- Ran 4 times, it gives me: 1070, 450, 503, 423 ms
GO

--Final Cleanup
drop table T2
drop table T3
drop table T1

I ran the above code for CURSOR & WHILE loop 4 times and it gave me less execution time for CURSOR, everytime.

While using CURSORS the main thing most people miss out are the options that are available to optimize CURSOR use.
According to MS BOL they are:

DECLARE cursor_name CURSOR [ LOCAL | GLOBAL ]
     [ FORWARD_ONLY | SCROLL ]
     [ STATIC | KEYSET | DYNAMIC | FAST_FORWARD ]
     [ READ_ONLY | SCROLL_LOCKS | OPTIMISTIC ]

To know more check this link: http://msdn.microsoft.com/en-us/library/ms180169.aspx

This prove that CURSORs are more performant than WHILE loops. I’m open for your comments, thanks.

MSDN links:
http://social.msdn.microsoft.com/Forums/en-US/transactsql/thread/e67105a6-0f4a-4a12-85b9-e7e9855279e7/
http://social.msdn.microsoft.com/Forums/en-US/transactsql/thread/2978d387-fcd0-45bb-bf69-80139b6dac53
http://social.msdn.microsoft.com/Forums/en-US/transactsql/thread/ba15132e-c26d-4472-92e9-845ce2cc244d/

Categories: Differences, Optimization Performance Tags: Cursor, Cursor vs While loop, WHILE loop

SELECT an XML string to a table – SQL Server

January 28, 2011 6 comments

A simple way to SELECT XML string in a tabular format by using Nodes and OPENXML().

–> By using FOR XML AUTO:

USE [AdventureWorks]
GO

--// Generate an XML result-set:
SELECT TOP 5 ContactID, FirstName, LastName
FROM Person.Contact
FOR XML AUTO
GO

Output as XML String:
<Person.Contact ContactID="1" FirstName="Gustavo" LastName="Achong"/>
<Person.Contact ContactID="2" FirstName="Catherine" LastName="Abel"/>
<Person.Contact ContactID="3" FirstName="Kim" LastName="Abercrombie"/>
<Person.Contact ContactID="4" FirstName="Humberto" LastName="Acevedo"/>
<Person.Contact ContactID="5" FirstName="Pilar" LastName="Ackerman"/>

–> SELECT by using XML Nodes:

DECLARE @xml XML
SET @xml = N'
<root><Person.Contact ContactID="1" FirstName="Gustavo" LastName="Achong"/>
<Person.Contact ContactID="2" FirstName="Catherine" LastName="Abel"/>
<Person.Contact ContactID="3" FirstName="Kim" LastName="Abercrombie"/>
<Person.Contact ContactID="4" FirstName="Humberto" LastName="Acevedo"/>
<Person.Contact ContactID="5" FirstName="Pilar" LastName="Ackerman"/></root>'

SELECT
Tab.Col.value('@ContactID','int') AS ContactID,
Tab.Col.value('@FirstName','varchar(20)') AS FirstName,
Tab.Col.value('@LastName','varchar(20)') AS LastName
FROM   @xml.nodes('/root/Person.Contact') Tab(Col)
GO

Output:
ContactID    FirstName    LastName
1            Gustavo      Achong
2            Catherine    Abel
3            Kim          Abercrombie
4            Humberto     Acevedo
5            Pilar        Ackerman

–> SELECT by using OPENXML:

DECLARE @xml XML
SET @xml = N'
<root><Person.Contact ContactID="1" FirstName="Gustavo" LastName="Achong"/>
<Person.Contact ContactID="2" FirstName="Catherine" LastName="Abel"/>
<Person.Contact ContactID="3" FirstName="Kim" LastName="Abercrombie"/>
<Person.Contact ContactID="4" FirstName="Humberto" LastName="Acevedo"/>
<Person.Contact ContactID="5" FirstName="Pilar" LastName="Ackerman"/></root>'

DECLARE @docHandle int
-- Create internal representation of the XML document and return the xml-doc Handle ID
EXEC sp_xml_preparedocument @docHandle OUTPUT, @xml

select @docHandle -- 37

SELECT *
FROM OPENXML(@docHandle, N'//Person.Contact')
WITH (ContactID INT, FirstName VARCHAR(20), LastName VARCHAR(20))

Output:
ContactID    FirstName    LastName
1            Gustavo      Achong
2            Catherine    Abel
3            Kim          Abercrombie
4            Humberto     Acevedo
5            Pilar        Ackerman

-- Remove xml-doc Handle
EXEC sp_xml_removedocument @docHandle
GO

>> Check & Subscribe my [YouTube videos] on SQL Server.

Categories: SQL Tips, XML Tags: FOR XML AUTO, OPENXML, sp_xml_preparedocument, sp_xml_removedocument, xml to table

Creating “Dynamic PIVOT” scripts in SQL Server

January 25, 2011 24 comments

My previous PIVOT UNPIVOT post describes to organize data in a cross tab fashion or to transpose columns into rows and vice-versa.

This is fine only when you know how many columns you need. But when columns vary according to the query then how could you apply them on your scripts. The only way is to store them in a string variable at runtime and apply them in a dynamic SQL query, shown below.

This problem was also discussed on MSDN’s following link: http://social.msdn.microsoft.com/Forums/en/transactsql/thread/26f86fd6-6d06-4cc5-9723-9ee4685bd48a where I proposed an answer also described below:

USE [tempdb]
GO

-- Create test tables
create table table1 (number int, desc varchar(20),
location int, numberatlocation int)
create table table2 (code int, name varchar(20))

-- Insert test data
insert into table1 values (12345,'test',1000,5)
insert into table1 values (12345,'test',1001,2)
insert into table1 values (12345,'test',1002,4)
insert into table1 values (12345,'test',1003,9)
insert into table1 values (12345,'test',1004,7)

insert into table2 values (1000,'loc1')
insert into table2 values (1001,'loc2')
insert into table2 values (1002,'loc3')
insert into table2 values (1003,'loc4')
insert into table2 values (1004,'loc5')

-- Static PIVOT
select number, description, [loc1], [loc2], [loc3], [loc4], [loc5]
from (select number, desc, numberatlocation, name
from table1 join table2 on table1.location=table2.code)p
PIVOT(MAX (numberatlocation) FOR Name IN ( [loc1], [loc2], [loc3], [loc4], [loc5] )
) AS pvt
ORDER BY number

Output of Static query:
number	desc	loc1	loc2	loc3	loc4	loc5
12345	test    5       2       4       9       7

-- Dynamic PIVOT

-- Lets add one more record on both the tables to check the results
insert into table1 values (12345,'test',1005,3)
insert into table2 values (1005,'loc6')

declare @col varchar(1000)
declare @sql varchar(2000)

select @col = COALESCE(@col + ', ','') + QUOTENAME(name)
from table2

select @col -- This gives: [loc1], [loc2], [loc3], [loc4], [loc5], [loc6]

-- Now setting this @col variable in the Dynamic SQL.
set @sql = '
select number, desc, ' + @col + '
from (select number, desc, numberatlocation, name
from table1 join table2 on table1.location=table2.code)p
PIVOT(MAX (numberatlocation) FOR Name IN ( ' + @col + ' )
) AS pvt
ORDER BY number'

print @sql

exec (@sql)

Output of Dynamic query:
number	desc	loc1	loc2	loc3	loc4	loc5    loc6
12345	test    5       2       4       9       7       3

-- Final Cleanup
drop table table1
drop table table2

Categories: SQL Tips Tags: DYNAMIC PIVOT, Dynamic SQL, PIVOT

UNION vs UNION ALL | which one is faster?

January 14, 2011 Leave a comment

UNION and UNION ALL are used to combine data or record sets from two different tables. One could combine more than 2 tables. Just like JOINS, UNION combines data into a single record-set but vertically by adding rows from another table. JOINS combine data horizontally by adding columns from another table.

UNION insures you get DISTINCT records from both the tables.
UNION ALL pulls out all records from both the tables with duplicates.

Generally “UNION ALL” is considered best in performance when compared with UNION.
But you should only use “UNION ALL” when you are sure that you have distinct or no common records in both the records-sets.

For more information on this check my earlier blog post which shows the main differences and how performance differs in b/w the two: https://sqlwithmanoj.wordpress.com/2010/12/30/why-union-all-is-faster-than-union/

Categories: Differences, Optimization Performance Tags: UNION, UNION ALL, union vs union all

Why UNION ALL is faster than UNION?

December 30, 2010 9 comments

UNION and UNION ALL are used to combine data or record sets from two different tables. One could combine more than 2 tables. Just like JOINS, UNION combines data into a single record-set but vertically by adding rows from another table. JOINS combine data horizontally by adding columns from another table.

UNION insures you get DISTINCT records from both the tables.
UNION ALL pulls out all records from both the tables with duplicates.

SQL Scripts below proves why UNION ALL is faster than UNION:

-- Create First table: #tempTable1
select FirstName, LastName into #tempTable1
from Person.Contact
where ContactID <= 100 -- Create Second table: #tempTable2 select FirstName, LastName into #tempTable2 from Person.Contact where ContactID > 100 and ContactID <= 200

Comparison:

SET STATISTICS PROFILE ON

-- Using UNION - Query Cost: 35%
select * from #tempTable1
UNION
select * from #tempTable2

Query Plan for UNION:
  |--Sort(DISTINCT ORDER BY:([Union1008] ASC, [Union1009] ASC))
       |--Concatenation
            |--Table Scan(OBJECT:([tempdb].[dbo].[#tempTable1]))
            |--Table Scan(OBJECT:([tempdb].[dbo].[#tempTable2]))

-- Using UNION ALL - Query Cost: 13%
select * from #tempTable1
UNION ALL
select * from #tempTable2

Query Plan for UNION ALL:
  |--Concatenation
       |--Table Scan(OBJECT:([tempdb].[dbo].[#tempTable1]))
       |--Table Scan(OBJECT:([tempdb].[dbo].[#tempTable2]))

-- Using UNION ALL with DISTINCT - Query Cost: 52%
select DISTINCT * from #tempTable1
UNION ALL
select DISTINCT * from #tempTable2

Query Plan for UNION ALL with DISTINCT:
  |--Concatenation
       |--Sort(DISTINCT ORDER BY:([tempdb].[dbo].[#tempTable1].[FirstName] ASC, [tempdb].[dbo].[#tempTable1].[LastName] ASC))
       |    |--Table Scan(OBJECT:([tempdb].[dbo].[#tempTable1]))
       |--Sort(DISTINCT ORDER BY:([tempdb].[dbo].[#tempTable2].[FirstName] ASC, [tempdb].[dbo].[#tempTable2].[LastName] ASC))
            |--Table Scan(OBJECT:([tempdb].[dbo].[#tempTable2]))

SET STATISTICS PROFILE OFF

-- Final Cleanup
drop table #tempTable1
drop table #tempTable2

The Query plan for:
– UNION ALL SELECT above shows it just does a concatenation.
– UNION SELECT above shows it does a SORT on final concatenation.
– UNION ALL with DISTINCT SELECT above shows it does a SORT of individual tables than does the final concatenation.

The Query cost with:

– UNION ALL = 13%
– UNION = 35%
– UNION ALL with DISTINCT = 52%

This proves that:

UNION ALL is faster and more optimized than UNION. But this does not mean you use UNION ALL in every scenario.
UNION is not equivalent to “UNION ALL with DISTINCT”.