Archive
[NOT] IN, [NOT] EXISTS, JOINS with NULL values
Using NOT IN could be dangerous and may result to incorrect results at times.
Let’s check this by a step by step example:
use [tempdb] go -- Create test tables and fill test data: CREATE TABLE [dbo].TestTable(AID int NOT NULL) INSERT INTO TestTable(AID) values (1) INSERT INTO TestTable(AID) values (2) INSERT INTO TestTable(AID) values (3) INSERT INTO TestTable(AID) values (4) go create TABLE TableWithNull(BID int NULL DEFAULT(NULL)) INSERT INTO TableWithNull(BID) VALUES(NULL) INSERT INTO TableWithNull(BID) VALUES(0) INSERT INTO TableWithNull(BID) VALUES(1) INSERT INTO TableWithNull(BID) VALUES(2) GO -- Check inserted records: SELECT * FROM TableWithNull -- NULL, 0, 1, 2 select * from TestTable -- 1, 2, 3, 4
–> Using IN & NOT IN
select * from TestTable where AID in (SELECT BID FROM TableWithNull) -- 1, 2 select * from TestTable where AID not in (SELECT BID FROM TableWithNull) -- 0 rows (expected 3, 4)
SQL is a set-based language, any set containing NULL value makes it whole as NULL. Because NULL is unknown and anything added to it also becomes unknown. Thus above SQL statement results 0 rows. To make it work you have to treat the NULL and add a WHERE clause or use EXISTS instead of IN shown below:
select * from TestTable where AID not in (SELECT BID FROM TableWithNull where BID is not null) -- 3, 4 (expected result) go
–> Using EXISTS & NOT EXISTS
select * from TestTable A where exists (SELECT * FROM TableWithNull B where A.AID = B.BID) -- 1, 2 select * from TestTable A where not exists (SELECT * FROM TableWithNull B where A.AID = B.BID) -- 3, 4 go
NOTE: EXISTS keyword is a good way to check sub-queries, this makes them co-related sub-queries. With EXISTS you don’t even need to provide any column name, just use “SELECT *”, as it does not use the SELECT list at all.
With IN you can only compare one column, but with EXISTS you can compare multiple columns within outer & inner queries (sub-queries or derived queries).
–> Using JOINS in place of IN() & EXISTS()
select A.* -- 1, 2 from TestTable A JOIN TableWithNull B ON A.AID = B.BID select A.* -- 3, 4 from TestTable A LEFT JOIN TableWithNull B ON A.AID = B.BID WHERE B.BID IS NULL go
–> Here’s another catch, weird behavior:
select * from TestTable where AID not in (SELECT AID FROM TableWithNull) -- Query works even when AID column is not in [TableWithNULL] table. -- [AID] column does not belong to [TableWithNull] table, but the query still work and won't give any error. -- So its always adviced to use table alias, shown below: select * from TestTable A where A.AID not in (SELECT B.AID FROM TableWithNull B) -- Invalid column name 'AID'.
NOTE: So always try to provide ALIAS you your tables and use it with COLUMN names in your queries.
-- Final Cleanup drop table TableWithNull drop table TestTable go
Physical Join vs Logical Join in SQL Server
Most of us know about JOINS in SQL Server and their types. But do we really know how they are interpreted in SQL Server internally. Today I found lot of informative, interesting and important sources regarding Logical and Physical joins in SQL Server (links below).
–> Classifying JOINS mainly into 2 types:
1. Logical Joins: These joins are simple joins that we apply in our SQL queries, like INNER JOIN, RIGHT/LEFT OUTER JOIN, CROSS JOIN, OUTER APPLY, etc.
2. Physical Joins: These are the joins that users don’t use/write in their SQL queries. Instead these are implemented inside SQL Server engine as operators or algorithms to implement the Logical Joins. Their types are Nested Loop, Merge and Hash.
For a particular SQL query when you try to view an Estimated Execution Plan or execute a query by selecting Actual Execution Plan, you can clearly see these Physical Joins under the Execution Plan tab in SSMS.
Reference for Logical Joins from one of my old posts: https://sqlwithmanoj.wordpress.com/2009/03/12/sql-server-joins-and-types
–> Logical Joins:
– Inner/Outer/Cross: https://blogs.msdn.microsoft.com/craigfr/2006/07/19/introduction-to-joins/
–> Physical Joins:
– Nested Loop Joins: https://blogs.msdn.microsoft.com/craigfr/2006/07/26/nested-loops-join/
– Merge Joins: https://blogs.msdn.microsoft.com/craigfr/2006/08/03/merge-join/
– Hash Joins: https://blogs.msdn.microsoft.com/craigfr/2006/08/10/hash-join/
Convert multiple Rows into a single column
Just replied an answer on a SQL forum (http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=148860), so thought to post this as well for future reference.
How will you output the following record set:
123456788 City 399.99
123456788 County 499.99
123456788 Flood 299.99
123456788 Hazzard 199.99
123456789 City 333.99
123456789 County 444.99
123456789 Flood 222.99
123456789 Hazzard 111.99
... into following pattern:
id Hazzard Flood City County
123456788 199.99 299.99 399.99 499.99
123456789 111.99 222.99 333.99 444.99
The following query will do it and convert the rows into columns:
select 123456788 as id, 'Hazzard' as des, 199.99 as val into #tempTable union select 123456788, 'Flood', 299.99 union select 123456788, 'City', 399.99 union select 123456788, 'County', 499.99 union select 123456789, 'Hazzard', 111.99 union select 123456789, 'Flood', 222.99 union select 123456789, 'City', 333.99 union select 123456789, 'County', 444.99 select * from #tempTable select a.id as id, a.val as Hazzard, b.val as Flood, c.val as City, d.val as County from #tempTable a,#tempTable b,#tempTable c, #tempTable d where a.id=b.id and b.id=c.id and c.id=d.id and a.des='Hazzard' and b.des='Flood' and c.des='City' and d.des='County' drop table #tempTable
The above select code is with a simple old style where clause with multiple tables joining. The code could be also converted to self-joins.
DB Basics – SQL Server JOINS and Types
JOIN clause in SQL Server is used to combine records and create a new record set from two tables based upon the relationship between them. The relationship is established by JOINing common columns with the ON clause from both the tables and returning only required columns from both the tables.
JOIN clause is specified with the FROM clause. Clauses like AND, WHERE and/or HAVING can also be used to filter the rows selected by the JOIN clause.
–> A JOIN table operator operates on two input tables. The three fundamental types of joins are CROSS JOIN, INNER JOIN, and OUTER JOINS. These three types of joins differ in how they apply their logical query processing phases; each type applies a different set of phases:
– A CROSS JOIN applies only one phase — Cartesian Product.
– An INNER JOIN applies two phases — Cartesian Product and Filter.
– An OUTER JOIN applies three phases — Cartesian Product, Filter, and Add Outer Rows.
–> Here is a pictorial representation of various types JOINs you can create in T-SQL:
–> Joins can be categorized as:
1. CROSS JOINs: Cross Joins return all rows from the Left table. Each row from the Left table is combined with all rows from the Right table. Cross Joins are also called Cartesian products.
2. INNER JOIN: (the typical Join operation, which uses some comparison operator like = or ). These include equi-joins and natural joins.
Inner Joins use a comparison operator to match rows from two tables based on the values in common columns from each table.
3. OUTER JOIN: Outer joins can be a Left, a Right, or Full Outer Join.
Outer joins are specified with one of the following sets of keywords when they are specified in the FROM clause:
3.a. LEFT JOIN or LEFT OUTER JOIN: The result set of a Left Outer Join includes all the rows from the left table specified in the LEFT OUTER clause, not just the ones in which the joined columns match. When a row in the left table has no matching rows in the right table, the associated result set row contains null values for all select list columns coming from the right table.
3.b. RIGHT JOIN or RIGHT OUTER JOIN: A Right Outer Join is the reverse of a Left Outer Join. All rows from the right table are returned. Null values are returned for the left table any time a right table row has no matching row in the left table.
3.c. FULL JOIN or FULL OUTER JOIN: A Full Outer Join returns all rows in both the Left and Right tables. Any time a row has no match in the other table, the select list columns from the other table contain null values. When there is a match between the tables, the entire result set row contains data values from the base tables.
>> Check & Subscribe my [YouTube videos] on SQL Server.