Archive

Archive for May, 2015

Reading XML data in SQL Server – MSDN TSQL forum

May 26, 2015 Leave a comment

–> Question:

I have a table with a column with data type XML.

I don’t know the contents of the XML structure etc and I need to extract it

Please suggest.
 

–> My Answer:

Check these blog posts on how to query and work with XML data in SQL:

Convert a table to XML and back to tabular format

Query XML data in tabulat format

Query Nested XML

Using APPLY operator

Read XML from a file

XML with Namespaces

Export XML to a file

All XML posts
 

Ref Link.


SQL Myth | Nested Transactions in SQL Server and hidden secrets

May 26, 2015 Leave a comment

There is no concept of Nested Transactions in SQL Server, Period.

There can be workarounds but Nested Transactions is not out of the box.

–> In my [previous post] we saw a scenario where you have a Nested Transaction. And we saw issues with Rolling back the inner Transaction and handling them gracefully with the Outer Transactions, which typically looked like this:

Nested Trsansaction

 
–> Here we will see how the nested Transactions behave internally by executing these SQL statements in chunks.

1. First of all we will create a sample table and execute the code till BEGIN outerTran section:

USE [tempdb]
GO

CREATE TABLE dbo.TranTest (ID INT)
GO

CHECKPOINT
GO

-- Outer Transaction - BEGIN
SELECT @@TRANCOUNT AS 'outerTran Begin', count(*) from dbo.TranTest -- 0, 0
BEGIN TRANSACTION outerTran

INSERT INTO dbo.TranTest values(1)
SELECT @@TRANCOUNT AS 'outerTran Begin', count(*) from dbo.TranTest -- 1, 1

– This was a simple one, one row got inserted with Transaction Count = 1, but its not committed yet.

– Let’s see the Transaction status by using the undocumented function fn_dblog(), how these are tracked in the DB Engine:

SELECT Operation, [Transaction ID], Description, Context
FROM fn_dblog(NULL, NULL) WHERE [Transaction ID] IN (
	SELECT [Transaction ID] 
	FROM fn_dblog(NULL, NULL) 
	WHERE [Description] like '%Tran%')

Nested Trsansaction fb_dblog 01

As you can see in the output above:
– The Outer BEGIN TRANSACTION statement is logged as LOP_BEGIN_XACT Operation and Description = ‘outerTran;0x01…’ for the outer Transaction.
– and INSERT statement is logged as LOP_INSERT_ROWS operation.
 

2. Now let’s execute the next section with the inner Transaction:

	-- Inner Transaction
	BEGIN TRANSACTION innerTran

	INSERT INTO dbo.TranTest values(2)
	SELECT @@TRANCOUNT AS 'innerTran Begin', count(*) from dbo.TranTest -- 2, 2

	COMMIT TRANSACTION innerTran
	SELECT @@TRANCOUNT AS 'innerTran Rollback', count(*) from dbo.TranTest -- 1, 2

– The first SELECT statement before the COMMIT statement gives count of two for the inserted rows in the table, with Transaction Count = 2.
– And after COMMIT Transaction the second SELECT statement gives Transaction count = 1.

– Let’s execute the same function and see the Transaction status:

SELECT Operation, [Transaction ID], Description, Context
FROM fn_dblog(NULL, NULL) WHERE [Transaction ID] IN (
	SELECT [Transaction ID] 
	FROM fn_dblog(NULL, NULL) 
	WHERE [Description] like '%Tran%')

Nested Trsansaction fb_dblog 02

– Here we don’t see any row for the inner Transaction with LOP_BEGIN_XACT Operation.
– A separate row is logged for the second INSERT statement as LOP_INSERT_ROWS operation with the same Transaction ID = ‘0000:00000992’.
Thus, this inner Transaction points to the Outer Transaction internally.
 

–> Now we will see what happens when we COMMIT or ROLLBACK the outer Transaction:

3.a. So, let’s COMMIT the outer Transaction first:

-- Outer Transaction - COMMIT
COMMIT TRANSACTION outerTran
SELECT @@TRANCOUNT AS 'outerTran Commit', count(*) from dbo.TranTest -- 0, 2
GO

SELECT * FROM dbo.TranTest -- 2
GO

– After the COMMIT statement the second SELECT statement gives Transaction count = 0, and there is no active Transaction left.
– And the final SELECT lists the 2 records inserted in the outer & inner Transactions.

Now again let’s execute the same function and see the Transaction status:

SELECT Operation, [Transaction ID], Description, Context
FROM fn_dblog(NULL, NULL) WHERE [Transaction ID] IN (
	SELECT [Transaction ID] 
	FROM fn_dblog(NULL, NULL) 
	WHERE [Description] like '%Tran%')

Nested Trsansaction fb_dblog 03a
– When COMMITTING the Outer Transaction it is logged as LOP_COMMIT_XACT Operation with the same Transaction ID = ‘0000:00000992’.
 

3.b. In case of ROLLBACK lets see what happens: You will need to execute all the SQL statements in Step 1 to 3 again.

-- Outer Transaction - COMMIT
COMMIT TRANSACTION outerTran
SELECT @@TRANCOUNT AS 'outerTran Commit', count(*) from dbo.TranTest -- 0, 2
GO

SELECT * FROM dbo.TranTest -- 2
GO

DROP TABLE dbo.TranTest
GO

Now again let’s execute the same function and see the Transaction status:

SELECT Operation, [Transaction ID], Description, Context
FROM fn_dblog(NULL, NULL) WHERE [Transaction ID] IN (
	SELECT [Transaction ID] 
	FROM fn_dblog(NULL, NULL) 
	WHERE [Description] like '%Tran%')

Nested Trsansaction fb_dblog 03b
As the outer Transaction is ROLLEDBACKED the inner Transaction also also gets Rollebacked, and thus you can see:
– two DELETE logged rows for the two INSERTed rows above, with LOP_DELETE_ROWS Operation and Description = COMPENSATION.
– and final ROLLBACK log with LOP_ABORT_XACT Operation with the same Transaction IDs.

 
The above exercise shows us that SQL Server only tracks the outermost Transaction, and do not bother about the inner Transactions.

–> So what’s the Hidden Secret?

1. First one is what we saw above, no Nested Transactions.

2. COMMIT TRANSACTION has an option to apply the Transaction name, but the DB Engine simply ignores it and points to the previous BEGIN TRANSACTION statement. Thus while issuing a COMMIT TRANSACTION referencing the name of an outer transaction when there are outstanding inner transactions it only decrements the @@TRANCOUNT value by 1.

3. ROLLBACK TRANSACTION also have an option to apply the Transaction name, but you can only apply the outermost Transaction name in case of Nested Transactions. While using ROLLBACK in inner Transactions you have to use either just ROLLBACK TRANSACTION or ROLLBACK TRANSACTION SavePoint_name, only if the inner transaction are created with SAVE TRANSACTION option instead of BEGIN TRANSACTION.

4. ROLLBACK TRANSACTION SavePoint_name does not decrement @@TRANCOUNT value.

 
Thus it is advised be careful while using Nested Transactions, ROLLBACKS and SavePoints, or just simply ignore them.


SQL error – Msg 6401, Level 16, State 1 – Cannot roll back Transaction. No transaction or savepoint of that name was found

May 25, 2015 1 comment

Yesterday in a SQL forum I saw a question regarding this error, and the person who had asked this question was finding it difficult to fix it.

He had a scenario where the Outer Transaction has an Inner Transaction, as shown below. And he was trying to ROLLBACK the inner Transaction and was getting an error for this similar SQL batch that I’ve tried to simulated here:

–> I’ve created a sample table let’s executed the whole below:

USE [tempdb]
GO

CREATE TABLE dbo.TranTest (ID INT)
GO

-- Outer Transaction - BEGIN
SELECT @@TRANCOUNT AS 'outerTran Begin', count(*) from dbo.TranTest -- 0, 0
BEGIN TRANSACTION outerTran

INSERT INTO dbo.TranTest values(1)
SELECT @@TRANCOUNT AS 'outerTran Begin', count(*) from dbo.TranTest -- 1, 1

	-- Inner Transaction
	BEGIN TRANSACTION innerTran

	INSERT INTO dbo.TranTest values(2)
	SELECT @@TRANCOUNT AS 'innerTran Begin', count(*) from dbo.TranTest -- 2, 2

	ROLLBACK TRANSACTION innerTran
	SELECT @@TRANCOUNT AS 'innerTran Rollback', count(*) from dbo.TranTest -- 2, 2

-- Outer Transaction - COMMIT
COMMIT TRANSACTION outerTran
SELECT @@TRANCOUNT AS 'outerTran Commit', count(*) from dbo.TranTest -- 1, 2
GO

SELECT * FROM dbo.TranTest -- 2
GO

– You can see the count of rows inserted in the comments with every SELECT statement.
– But the ROLLBACK TRANSACTION innerTran statement at Line 20 fails with following error:
Msg 6401, Level 16, State 1, Line 20
Cannot roll back innerTran1. No transaction or savepoint of that name was found.

– And thus the last SELECT statement still gives Transaction count = 2, as this Transaction is not Rollbacked and neither committed.

–> NOTE: With the ROLLBACK TRANSACTION statement as per MS BoL you can only specify the name of the outermost BEGIN TRANSACTION statement. So, how can we make this code fail-proof?
 

There are 2 ways to solve this:
1. Just use ROLLBACK TRANSACTION without the name of any transaction.
2. Use SAVE TRANSACTION instead of BEGIN TRANSACTION for inner Transactions.
 

Option #1: Using just ROLLBACK TRANSACTION statement:

-- Outer Transaction - BEGIN
SELECT @@TRANCOUNT AS 'outerTran Begin', count(*) from dbo.TranTest -- 0, 0
BEGIN TRANSACTION outerTran

INSERT INTO dbo.TranTest values(1)
SELECT @@TRANCOUNT AS 'outerTran Begin', count(*) from dbo.TranTest -- 1, 1

	-- Inner Transaction
	BEGIN TRANSACTION innerTran

	INSERT INTO dbo.TranTest values(2)
	SELECT @@TRANCOUNT AS 'innerTran Begin', count(*) from dbo.TranTest -- 2, 2

	ROLLBACK TRANSACTION -- here, removed inner Transaction name
	SELECT @@TRANCOUNT AS 'innerTran Rollback', count(*) from dbo.TranTest -- 0, 0

-- Outer Transaction - COMMIT
COMMIT TRANSACTION outerTran
SELECT @@TRANCOUNT AS 'outerTran Commit', count(*) from dbo.TranTest -- 1, 3
GO

SELECT * FROM dbo.TranTest -- 0
GO

– This ROLLBACK statement rollbacks the whole Transaction resulting no rows in dbo.TransTest table.
– But the COMMIT TRANSACTION outerTran statement at Line 20 fails with following error:
Msg 3902, Level 16, State 1, Line 24
The COMMIT TRANSACTION request has no corresponding BEGIN TRANSACTION.

 

Option #2: Using SAVE TRANSACTION instead of BEGIN TRANSACTION

-- Outer Transaction - BEGIN
SELECT @@TRANCOUNT AS 'outerTran Begin', count(*) from dbo.TranTest -- 0, 0
BEGIN TRANSACTION outerTran

INSERT INTO dbo.TranTest values(1)
SELECT @@TRANCOUNT AS 'outerTran Begin', count(*) from dbo.TranTest -- 1, 1

	-- Inner Transaction
	SAVE TRANSACTION innerTran -- here, created a SavePoint

	INSERT INTO dbo.TranTest values(2)
	SELECT @@TRANCOUNT AS 'innerTran Begin', count(*) from dbo.TranTest -- 1, 2

	ROLLBACK TRANSACTION innerTran -- Rolling back a SavePoint
	SELECT @@TRANCOUNT AS 'innerTran Rollback', count(*) from dbo.TranTest -- 1, 1

-- Outer Transaction - COMMIT
COMMIT TRANSACTION outerTran
SELECT @@TRANCOUNT AS 'outerTran Commit', count(*) from dbo.TranTest -- 0, 1
GO

SELECT * FROM dbo.TranTest -- 1
GO

As you can see that we have converted the Inner Transaction to a Save Point, as as per the error we’ve taken care of this with option #2.

In this case a SavePoint was created and you can Rollback a SavePoint inside the outer Transaction. Thus the outer transaction was finally committed with table having value “1” and inner SavePoint was Rollbacked with row having value “2”.

COMMIT vs ROLLBACK


SQL Myth | Primary Key (PK) always creates Clustered Index

May 24, 2015 4 comments

… and this a default nature, which can be overrided to create a PK with a Non-Clustered Index instead of a Clustered one.

– What happens when a Primary Key is created on a table?
– If I have a PK on a table will the table be a Heap or not?
– Can I create a PK with a Non-Clustered Index? (this is a big hint)

While Interviewing candidates I’ve confronted them with these type of question, and very few were able to answer these correctly.

–> So, here we will see the default behavior of Primary Keys and how we can override it:

–> Having the “PRIMARY KEY” option inline with the column name: While creating a new table when you specify a “PRIMARY KEY” option inline with the Key Column, by-default it creates a Clustered Index on that table with a PK Constraint on that column.

USE [tempdb]
GO

CREATE TABLE dbo.Employee (
	EmpID		 INT NOT NULL PRIMARY KEY, -- here
	EmpLogInID 	 VARCHAR(255) NULL,
	EmpFirstName VARCHAR(255) NOT NULL,
	EmpLastName	 VARCHAR(255) NOT NULL,
	Gender		 BIT NOT NULL,
	JobTitle	 VARCHAR(255) NULL,
	BOD			 DATETIME NULL,
	DOJ			 DATETIME NULL,
	DeptID		 INT NOT NULL
)
GO

sp_help 'dbo.Employee'
GO

PK_ByDefault
 
The above image shows:
– a Unique Clustered Index created on the [EmpID] column with a name automatically suggested by the DB-engine, and
– a PK Constraint created on the same column.
 


 
–> Overriding this behavior by having a “CONSTRAINT” option: Here we will not create the PK inline with the Key Column [EmpID]. But we will:
– have a separate PK constraint created with Non-Clustered Index for [EmpID] column, and
– an another SQL statement to create a Clustered Index on the [EmpLogInID] column.

DROP TABLE dbo.Employee
GO

CREATE TABLE dbo.Employee (
	EmpID		 INT NOT NULL,
	EmpLogInID 	 VARCHAR(255) NULL,
	EmpFirstName VARCHAR(255) NOT NULL,
	EmpLastName	 VARCHAR(255) NOT NULL,
	Gender		 BIT NOT NULL,
	JobTitle	 VARCHAR(255) NULL,
	BOD			 DATETIME NULL,
	DOJ			 DATETIME NULL,
	DeptID		 INT NOT NULL
	CONSTRAINT [PK_Employee_EmpID] PRIMARY KEY NONCLUSTERED (EmpID ASC) -- here
)
GO

-- Creating the Clustered Index separately on an other column:
CREATE CLUSTERED INDEX [CI_Employee_EmpLogInID] ON dbo.Employee(EmpLogInID ASC)
GO

sp_help 'dbo.Employee'
GO

PK_Override
 
The above image shows:
– a Clustered Index (Non-Unique) created on the [EmpLogInID] column with a name we provided,
– a Non-Clustered Index (Unique) created on the [EmpID] column, and
– a PK Constraint created on the [EmpID] column with a name we provided.
 

So, it is advised to choose your PK & Clustered/Non-Clustered index wisely based upon a proper and justified Business logic. Please do not consider this as a Use Case, but just an example on how to deal with PKs & Indexes.
 

Check the video on Primary Keys:

PK Constraint


SQL Basics – Difference between WHERE, GROUP BY and HAVING clause

May 23, 2015 6 comments

All these three Clauses are a part/extensions of a SQL Query, are used to Filter, Group & re-Filter rows returned by a Query respectively, and are optional. Being Optional they play very crucial role while Querying a database.

–> Here is the logical sequence of execution of these clauses:

1. WHERE clause specifies search conditions for the rows returned by the Query and limits rows to a meaningful set.

2. GROUP BY clause works on the rows returned by the previous step #1. This clause summaries identical rows into a single/distinct group and returns a single row with the summary for each group, by using appropriate Aggregate function in the SELECT list, like COUNT(), SUM(), MIN(), MAX(), AVG(), etc.

3. HAVING clause works as a Filter on top of the Grouped rows returned by the previous step #2. This clause cannot be replaced by a WHERE clause and vice-versa.

As these clauses are optional thus a minimal SQL Query looks like this:

SELECT *
FROM [Sales].[SalesOrderHeader]

This Query returns around 32k (thousand) rows form SalesOrderHeader table. Thus, if somebody wants to do some analysis on this big row-set it would be very difficult and time consuming for him.

–> Use Case: Let’s say a Sales department wants to get a list of such Customers who bought more number of items last year, so that they can sell more some stuff to them this year. How they will go ahead?

1. Using WHERE clause: First of all they will need to apply filter on above ~32k rows and get list of Orders that were made last year (i.e. in 2014) to limit the row-set, like:

SELECT *
FROM [Sales].[SalesOrderHeader]
WHERE OrderDate >= '2014-01-01 00:00:00.000'
AND OrderDate < '2015-01-01 00:00:00.000'

This Query still gives ~12k records and its still difficult to identify such Customers who have more orders.

2. Using GROUP BY clause: Here we need to group the Customers with their number of Orders, like:

SELECT CustomerID, COUNT(*) AS OrderNos
FROM [Sales].[SalesOrderHeader]
WHERE OrderDate >= '2014-01-01 00:00:00.000'
AND OrderDate < '2015-01-01 00:00:00.000'
GROUP BY CustomerID

GROUP_BY clause
This query still returns ~10k records, and I’ve go through the entire list of records to identify such records. Is there any way where I can still filter out the unwanted records with lesser count?

3. USING HAVING clause: This will works on top of GROUP BY clause to filter the grouped records onCOUNT(*) AS OrderNos column values (like a WHERE clause), like:

SELECT CustomerID, COUNT(*) AS OrderNos
FROM [Sales].[SalesOrderHeader]
WHERE OrderDate >= '2014-01-01 00:00:00.000'
AND OrderDate < '2015-01-01 00:00:00.000'
GROUP BY CustomerID
HAVING COUNT(*) > 10

HAVING clause

Thus, by using all these these clauses we can reduce and narrow down the row-set to do some quick analysis.

Check this video tutorial on WHERE clause and difference with GROUP BY & HAVING clause.