Archive
UNION vs UNION ALL | which one is faster?
UNION and UNION ALL are used to combine data or record sets from two different tables. One could combine more than 2 tables. Just like JOINS, UNION combines data into a single record-set but vertically by adding rows from another table. JOINS combine data horizontally by adding columns from another table.
UNION insures you get DISTINCT records from both the tables.
UNION ALL pulls out all records from both the tables with duplicates.
Generally “UNION ALL” is considered best in performance when compared with UNION.
But you should only use “UNION ALL” when you are sure that you have distinct or no common records in both the records-sets.
For more information on this check my earlier blog post which shows the main differences and how performance differs in b/w the two: https://sqlwithmanoj.wordpress.com/2010/12/30/why-union-all-is-faster-than-union/
Why UNION ALL is faster than UNION?
UNION and UNION ALL are used to combine data or record sets from two different tables. One could combine more than 2 tables. Just like JOINS, UNION combines data into a single record-set but vertically by adding rows from another table. JOINS combine data horizontally by adding columns from another table.
UNION insures you get DISTINCT records from both the tables.
UNION ALL pulls out all records from both the tables with duplicates.
SQL Scripts below proves why UNION ALL is faster than UNION:
-- Create First table: #tempTable1 select FirstName, LastName into #tempTable1 from Person.Contact where ContactID <= 100 -- Create Second table: #tempTable2 select FirstName, LastName into #tempTable2 from Person.Contact where ContactID > 100 and ContactID <= 200
Comparison:
SET STATISTICS PROFILE ON -- Using UNION - Query Cost: 35% select * from #tempTable1 UNION select * from #tempTable2
Query Plan for UNION:
|--Sort(DISTINCT ORDER BY:([Union1008] ASC, [Union1009] ASC))
|--Concatenation
|--Table Scan(OBJECT:([tempdb].[dbo].[#tempTable1]))
|--Table Scan(OBJECT:([tempdb].[dbo].[#tempTable2]))
-- Using UNION ALL - Query Cost: 13% select * from #tempTable1 UNION ALL select * from #tempTable2
Query Plan for UNION ALL:
|--Concatenation
|--Table Scan(OBJECT:([tempdb].[dbo].[#tempTable1]))
|--Table Scan(OBJECT:([tempdb].[dbo].[#tempTable2]))
-- Using UNION ALL with DISTINCT - Query Cost: 52% select DISTINCT * from #tempTable1 UNION ALL select DISTINCT * from #tempTable2
Query Plan for UNION ALL with DISTINCT:
|--Concatenation
|--Sort(DISTINCT ORDER BY:([tempdb].[dbo].[#tempTable1].[FirstName] ASC, [tempdb].[dbo].[#tempTable1].[LastName] ASC))
| |--Table Scan(OBJECT:([tempdb].[dbo].[#tempTable1]))
|--Sort(DISTINCT ORDER BY:([tempdb].[dbo].[#tempTable2].[FirstName] ASC, [tempdb].[dbo].[#tempTable2].[LastName] ASC))
|--Table Scan(OBJECT:([tempdb].[dbo].[#tempTable2]))
SET STATISTICS PROFILE OFF -- Final Cleanup drop table #tempTable1 drop table #tempTable2
The Query plan for:
– UNION ALL SELECT above shows it just does a concatenation.
– UNION SELECT above shows it does a SORT on final concatenation.
– UNION ALL with DISTINCT SELECT above shows it does a SORT of individual tables than does the final concatenation.
The Query cost with:
– UNION ALL = 13%
– UNION = 35%
– UNION ALL with DISTINCT = 52%
This proves that:
- UNION ALL is faster and more optimized than UNION. But this does not mean you use UNION ALL in every scenario.
- UNION is not equivalent to “UNION ALL with DISTINCT”.