December | 2010 | SQL with Manoj

Why UNION ALL is faster than UNION?

December 30, 2010 9 comments

UNION and UNION ALL are used to combine data or record sets from two different tables. One could combine more than 2 tables. Just like JOINS, UNION combines data into a single record-set but vertically by adding rows from another table. JOINS combine data horizontally by adding columns from another table.

UNION insures you get DISTINCT records from both the tables.
UNION ALL pulls out all records from both the tables with duplicates.

SQL Scripts below proves why UNION ALL is faster than UNION:

-- Create First table: #tempTable1
select FirstName, LastName into #tempTable1
from Person.Contact
where ContactID <= 100 -- Create Second table: #tempTable2 select FirstName, LastName into #tempTable2 from Person.Contact where ContactID > 100 and ContactID <= 200

Comparison:

SET STATISTICS PROFILE ON

-- Using UNION - Query Cost: 35%
select * from #tempTable1
UNION
select * from #tempTable2

Query Plan for UNION:
  |--Sort(DISTINCT ORDER BY:([Union1008] ASC, [Union1009] ASC))
       |--Concatenation
            |--Table Scan(OBJECT:([tempdb].[dbo].[#tempTable1]))
            |--Table Scan(OBJECT:([tempdb].[dbo].[#tempTable2]))

-- Using UNION ALL - Query Cost: 13%
select * from #tempTable1
UNION ALL
select * from #tempTable2

Query Plan for UNION ALL:
  |--Concatenation
       |--Table Scan(OBJECT:([tempdb].[dbo].[#tempTable1]))
       |--Table Scan(OBJECT:([tempdb].[dbo].[#tempTable2]))

-- Using UNION ALL with DISTINCT - Query Cost: 52%
select DISTINCT * from #tempTable1
UNION ALL
select DISTINCT * from #tempTable2

Query Plan for UNION ALL with DISTINCT:
  |--Concatenation
       |--Sort(DISTINCT ORDER BY:([tempdb].[dbo].[#tempTable1].[FirstName] ASC, [tempdb].[dbo].[#tempTable1].[LastName] ASC))
       |    |--Table Scan(OBJECT:([tempdb].[dbo].[#tempTable1]))
       |--Sort(DISTINCT ORDER BY:([tempdb].[dbo].[#tempTable2].[FirstName] ASC, [tempdb].[dbo].[#tempTable2].[LastName] ASC))
            |--Table Scan(OBJECT:([tempdb].[dbo].[#tempTable2]))

SET STATISTICS PROFILE OFF

-- Final Cleanup
drop table #tempTable1
drop table #tempTable2

The Query plan for:
– UNION ALL SELECT above shows it just does a concatenation.
– UNION SELECT above shows it does a SORT on final concatenation.
– UNION ALL with DISTINCT SELECT above shows it does a SORT of individual tables than does the final concatenation.

The Query cost with:

– UNION ALL = 13%
– UNION = 35%
– UNION ALL with DISTINCT = 52%

This proves that:

UNION ALL is faster and more optimized than UNION. But this does not mean you use UNION ALL in every scenario.
UNION is not equivalent to “UNION ALL with DISTINCT”.

ISNULL vs COALESCE – expressions/functions in SQL Server

December 23, 2010 20 comments

ISNULL & COALESCE with some common features makes them equivalent, but some features makes them work and behave differently, shown below.

– Similarity
Both can be use to build/create a CSV list as shown below:

USE [AdventureWorks]
GO

DECLARE @csv VARCHAR(2000)

SELECT @csv = ISNULL(@csv + ', ', '') + FirstName
FROM Person.Contact
WHERE ContactID <= 10
ORDER BY FirstName

select @csv

set @csv=NULL
SELECT @csv = COALESCE(@csv + ', ', '') + FirstName
FROM Person.Contact
WHERE ContactID <= 10
ORDER BY FirstName

select @csv

Both will give the same output:
Carla, Catherine, Frances, Gustavo, Humberto, Jay, Kim, Margaret, Pilar, Ronald

– Difference #1
ISNULL accepts only 2 parameters. The first parameter is checked for NULL value, if it is NULL then the second parameter is returned, otherwise it returns first parameter.
COALESCE accepts two or more parameters. One can apply 2 or as many parameters, but it returns only the first non NULL parameter, example below.

DECLARE @str1 VARCHAR(10), @str2 VARCHAR(10)

-- ISNULL() takes only 2 arguments
SELECT ISNULL(@str1, 'manoj') AS 'IS_NULL' -- manoj

-- COALESCE takes multiple arguments and returns first non-NULL argument
SELECT COALESCE(@str1, @str2, 'manoj') AS 'COALESCE' -- manoj

-- ISNULL() equivalent of COALESCE, by nesting of ISNULL()
SELECT ISNULL(@str1, ISNULL(@str2, 'manoj')) AS 'IS_NULL eqv' -- manoj

– Difference #2
ISNULL does not implicitly converts the datatype if both parameters datatype are different.
On the other side COALESCE implicitly converts the parameters datatype in order of higher precedence.

-- ISNULL Does not do Implicit conversion
select ISNULL(10, getdate()) as 'IS_NULL' -- Errors out

Error Message:
Msg 257, Level 16, State 3, Line 1
Implicit conversion from data type datetime to int is not allowed.
Use the CONVERT function to run this query.

-- COALESCE Does Implicit conversion and gets converted to higher precedence datatype.
select COALESCE(10, getdate()) as 'COALESCE' -- 1900-01-11 00:00:00.000, outputs 10 but convert it to datetime [datetime > int]
select COALESCE(getdate(),10) as 'COALESCE' -- {Current date} 2010-12-23 23:36:31.110
select COALESCE(10, 'Manoj') as 'COALESCE' -- 10 [int > varchar]
select COALESCE('Manoj',10) as 'COALESCE' -- Errors out, it does an implicit conversion, but cannot change 'Manoj' to Integer.

Error Message:
Msg 245, Level 16, State 1, Line 1
Conversion failed when converting the varchar value 'Manoj' to data type int.

– Difference #3
Similar to above point ISNULL always returns the value with datatype of first parameter.
Contrary to this, COALESCE returns the datatype value according to the precedence and datatype compatibility.

DECLARE @str VARCHAR(5)

SET @str = NULL

-- ISNULL returns truncated value after its fixed size, here 5
SELECT ISNULL(@str, 'Half Full') AS 'IS_NULL' -- Half

-- COALESCE returns full length value, returns full 12 char string
SELECT COALESCE(@str, 'Half Full') AS 'COALESCE' -- Half Full

– Difference #4
According to MS BOL, ISNULL and COALESCE though equivalent, can behave differently. An expression involving ISNULL with non-null parameters is considered to be NOT NULL, while expressions involving COALESCE with non-null parameters is considered to be NULL. Thus to index expressions involving COALESCE with non-null parameters, the computed column can be persisted using the PERSISTED column attribute.

-- ISNULL() is allowed in computed columns with Primary Key
CREATE TABLE T1 (
	col1 INT,
	col2 AS ISNULL(col1, 1) PRIMARY KEY)

-- COALESCE() is not allowed in non-persisted computed columns with Primary Key
CREATE TABLE T2 (
	col1 INT,
	col2 AS COALESCE(col1, 1) PRIMARY KEY)

Error Message:
Msg 1711, Level 16, State 1, Line 1
Cannot define PRIMARY KEY constraint on column 'col2' in table 'T2'. 
The computed column has to be persisted and not nullable.
Msg 1750, Level 16, State 0, Line 1
Could not create constraint. See previous errors.

-- COALESCE() is only allowed as persisted computed columns with Primary Key
CREATE TABLE T2 (
	col1 INT,
	col2 AS COALESCE(col1, 1) PERSISTED PRIMARY KEY)

-- Clean up
DROP TABLE T1
DROP TABLE T2

MSDN BOL links:
ISNULL: http://msdn.microsoft.com/en-us/library/ms184325.aspx
COALESCE: http://msdn.microsoft.com/en-us/library/ms190349.aspx
http://code.msdn.microsoft.com/SQLExamples/Wiki/View.aspx?title=ISNULL_COALESCE

Categories: Differences, SQL DB Engine Tags: COALESCE, COMPUTED COLUMNS, ISNULL, ISNULL vs COALESCE

Convert Hex to String – MSDN TSQL forum

December 22, 2010 Leave a comment

–> Question:

How can I convert the hex code ‘0x16004D616E75623232’ to string?

An online utility provides me its equivalent: http://www.string-functions.com/hex-string.aspx

But how can I do this by TSQL, CLR would be a second thought.

–> Answer:

Didn’t thought it was so simple, actually my hex string had ‘1600’ prefixed after 0x, on removing it I’m getting the expected o/p, which is:

SELECT CONVERT(varchar(64), 0x4D616E75623232, 0)

–> Answer from Hunchback:

If you are using SS 2008 or earlier, then check function CONVERT in BOL.

SELECT
  CONVERT(varbinary(64), '0x16004D616E75623232', 1),
  CONVERT(varchar(64), 0x16004D616E75623232, 1),
  CONVERT(varchar(64), 0x16004D616E75623232, 2);
GO

For lower versions you can use some tricks from Umachandar Jayachandran, link: http://social.msdn.microsoft.com/Forums/en-US/transactsql/thread/20e92e10-a0ab-4a53-a766-76f84bfd4e8c

Ref link

Categories: SQL Tips Tags: Hex to String, MSDN TSQL forum

SQL Error – UPDATE statement conflicted with the FOREIGN KEY/REFERENCE constraint

December 21, 2010 11 comments

Just replied on MSDN forum’s following post: http://social.msdn.microsoft.com/Forums/en-US/transactsql/thread/289c4ca5-6ab2-4ed6-b446-0300f5c7dc88

When tables are referenced by Foreign Key constraints it gets difficult to UPDATE the key columns.
An example shows this issue and how one can perform UPDATE on FK columns.

USE [tempdb]
GO

-- CREATE test tables
CREATE TABLE dept (d_id INT PRIMARY KEY, dept VARCHAR(50))

CREATE TABLE emp (sn INT PRIMARY KEY, ename VARCHAR(50),
	d_fk INT CONSTRAINT fk_dept_d_fk FOREIGN KEY (d_fk) REFERENCES dept(d_id))

-- INSERT test data
INSERT INTO dept
SELECT 1, 'finance' UNION
SELECT 2, 'hr'

INSERT INTO emp
SELECT 1, 'Manoj', 1 UNION
SELECT 2, 'Saurabh', 2 UNION
SELECT 3, 'Kanchan', 1 UNION
SELECT 4, 'Pooja', 2

-- SELECT the INSERTed data
SELECT * from dept

d_id	dept
1	finance
2	hr	-- DeptID needs to be updated to 3

SELECT * from emp

sn	ename	d_fk
1	Manoj	1
2	Saurabh	2	-- DeptID should also be updated here to 3
3	Kanchan	1
4	Pooja	2	-- DeptID should also be updated here to 3

-- Now lets UPDATE the FK column values
UPDATE dept SET d_id=3 WHERE d_id=2

Error Message:
Msg 547, Level 16, State 0, Line 1
The UPDATE statement conflicted with the REFERENCE constraint "fk_dept_d_fk".
The conflict occurred in database "tempdb", table "dbo.emp", column 'd_fk'.
The statement has been terminated.

UPDATE emp SET d_fk=3 WHERE d_fk=2

Error Message:
Msg 547, Level 16, State 0, Line 1
The UPDATE statement conflicted with the FOREIGN KEY constraint "fk_dept_d_fk".
The conflict occurred in database "tempdb", table "dbo.dept", column 'd_id'.
The statement has been terminated.

We can disable the FK constraints by ALTER statement before the UPDATE and enable them after the UPDATE, like:

-- Disable FK Constraint
ALTER TABLE emp NOCHECK CONSTRAINT fk_dept_d_fk

-- Perform UPDATE
UPDATE dept SET d_id=3 WHERE d_id=2
UPDATE emp SET d_fk=3 WHERE d_fk=2

-- Enable FK Constraint
ALTER TABLE emp WITH CHECK CHECK CONSTRAINT fk_dept_d_fk

-- Following SELECTS will show the new values in updated records:
SELECT * from dept

d_id	dept
1	finance
3	hr	-- Updated record

SELECT * from emp

sn	ename	d_fk
1	Manoj	1
2	Saurabh	3	-- Updated record
3	Kanchan	1
4	Pooja	3	-- Updated record

-- Final Cleanup, DROP the tables.
DROP TABLE emp
DROP TABLE dept

MSDN BOL: http://msdn.microsoft.com/en-us/library/10cetyt6(v=VS.80).aspx

Categories: SQL Errors, SQL Tips Tags: NOCHECK CONSTRAINT, WITH CHECK

CROSS APPLY vs OUTER APPLY operators in SQL Server

December 11, 2010 7 comments

In my [previous post] we learnt about UDFs, their types and implementations. UDFs can be used in queries at column level, table levels and on column definition while creating tables.

They can also be joined with other tables, but not by simple joins. They use special join-keyword called APPLY operator.

According to MS BOL an APPLY operator allows you to invoke a table-valued function for each row returned by an outer table expression of a query. The table-valued function acts as the right input and the outer table expression acts as the left input. The right input is evaluated for each row from the left input and the rows produced are combined for the final output. The list of columns produced by the APPLY operator is the set of columns in the left input followed by the list of columns returned by the right input.

–> There are 2 forms of APPLY operators:

1. CROSS APPLY acts as INNER JOIN, returns only rows from the outer table that produce a result set from the table-valued function.

2. OUTER APPLY acts as OUTER JOIN, returns both rows that produce a result set, and rows that do not, with NULL values in the columns produced by the table-valued function.

Lets take 2 tables: Person.Contact & Sales.SalesOrderHeader

SELECT * FROM Person.Contact WHERE ContactID = 100
SELECT * FROM Sales.SalesOrderHeader WHERE ContactID = 100

You have a UDF that returns Sales Order Details of a Particular Contact. Now you want to use that UDF to know what all Contacts have Ordered what with other details, let’s see:

–> First creating a UDF to test with JOINS & APPLY:

--// Create Multiline UserDefinedFunction [dbo].[ufn_mtv_GetContactSales]
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE FUNCTION [dbo].[ufn_mtv_GetContactSales](@ContactID int)
RETURNS @retSalesInfo TABLE (
    [ContactID] INT NOT NULL,
	[SalesOrderID] INT NULL,
	[ProductID] INT NULL,
	[Name] NVARCHAR(50) NULL,
    [OrderDate] DATETIME NULL,
    [DueDate] DATETIME NULL,
    [ShipDate] DATETIME NULL,
	[TotalDue] MONEY NULL,
    [Status] TINYINT NULL,
	[SalesPersonID] INT NULL)
AS
BEGIN
    IF @ContactID IS NOT NULL
    BEGIN
        INSERT @retSalesInfo
		SELECT h.[ContactID], h.[SalesOrderID], p.[ProductID], p.[Name], h.[OrderDate], h.[DueDate],
			   h.[ShipDate], h.[TotalDue], h.[Status], h.[SalesPersonID]
		FROM Sales.SalesOrderHeader AS h
		JOIN Sales.SalesOrderDetail AS d ON d.SalesOrderID = h.SalesOrderID
		JOIN Production.Product AS p ON p.ProductID = d.ProductID
		WHERE ContactID = @ContactID
    END
	-- Return the recordsets
    RETURN
END

--// Test the UDF
SELECT * FROM dbo.ufn_mtv_GetContactSales(100)

–> Trying to JOIN UDF with a table, problem is you need to apply a parameter and it can’t be a column, but a value:

--// UDF with JOIN, try it out!!!
SELECT *
FROM Person.Contact c
JOIN dbo.ufn_mtv_GetContactSales(100) f -- You will have to pass the ContactID parameter, so no use of joining.
ON f.ContactID = c.ContactID

–> Testing with CROSS APPLY:

--// CROSS APPLY -- 279 records (All matched records, 1 missing out of 280)
SELECT c.[ContactID], c.[FirstName], c.[LastName], c.[EmailAddress], c.[Phone], s.*
FROM Person.Contact AS c
CROSS APPLY ufn_mtv_GetContactSales(c.ContactID) AS s
WHERE c.ContactID between 100 and 105

-- Same equivalent query without cross apply, using JOINs -- 279 records
SELECT c.[ContactID], c.[FirstName], c.[LastName], c.[EmailAddress], c.[Phone],
	   h.[ContactID], h.[SalesOrderID], p.[ProductID], p.[Name], h.[OrderDate], h.[DueDate],
	   h.[ShipDate], h.[TotalDue], h.[Status], h.[SalesPersonID]
FROM Person.Contact AS c
JOIN Sales.SalesOrderHeader AS h ON c.ContactID = h.ContactID
JOIN Sales.SalesOrderDetail AS d ON d.SalesOrderID = h.SalesOrderID
JOIN Production.Product AS p ON p.ProductID = d.ProductID
WHERE c.ContactID between 100 and 105

–> Testing with OUTER APPLY:

--// OUTER APPLY -- 280 records (All 280 records with 1 not matched)
SELECT c.[ContactID], c.[FirstName], c.[LastName], c.[EmailAddress], c.[Phone], s.*
FROM Person.Contact AS c
OUTER APPLY ufn_mtv_GetContactSales(c.ContactID) AS s
WHERE c.ContactID between 100 and 105

-- Same equivalent query without OUTER APPLY, using LEFT JOINs -- 280 records
SELECT c.[ContactID], c.[FirstName], c.[LastName], c.[EmailAddress], c.[Phone],
	   h.[ContactID], h.[SalesOrderID], p.[ProductID], p.[Name], h.[OrderDate], h.[DueDate],
	   h.[ShipDate], h.[TotalDue], h.[Status], h.[SalesPersonID]
FROM Person.Contact AS c
LEFT JOIN Sales.SalesOrderHeader AS h ON c.ContactID = h.ContactID
LEFT JOIN Sales.SalesOrderDetail AS d ON d.SalesOrderID = h.SalesOrderID
LEFT JOIN Production.Product AS p ON p.ProductID = d.ProductID
WHERE c.ContactID between 100 and 105

>> Check & Subscribe my [YouTube videos] on SQL Server.

Categories: Differences Tags: APPLY Operator, CROSS APPLY, OUTER APPLY

Older Entries