October | 2011 | SQL with Manoj

Using DML with XML – UPDATE & DELETE (SQL Server)

October 31, 2011 Leave a comment

In my previous post we saw how we can INSERT records in a XML string. On this post we”ll see the UPDATE & DELETE operations by using same xml.modify() method.

DECLARE @xml XML

-- Taking the sample XML string we created in my previous post:
SET @xml=N'&amp;lt;Root&amp;gt;
  &amp;lt;stud id=&amp;quot;0&amp;quot;&amp;gt;
    &amp;lt;fname&amp;gt;Kanchan&amp;lt;/fname&amp;gt;
    &amp;lt;lname&amp;gt;Pandey&amp;lt;/lname&amp;gt;
    &amp;lt;class&amp;gt;10&amp;lt;/class&amp;gt;
    &amp;lt;marks&amp;gt;90.5&amp;lt;/marks&amp;gt;
  &amp;lt;/stud&amp;gt;
  &amp;lt;stud id=&amp;quot;1&amp;quot;&amp;gt;
    &amp;lt;fname&amp;gt;Manoj&amp;lt;/fname&amp;gt;
    &amp;lt;lname&amp;gt;Pandey&amp;lt;/lname&amp;gt;
    &amp;lt;class&amp;gt;10&amp;lt;/class&amp;gt;
    &amp;lt;marks&amp;gt;80.5&amp;lt;/marks&amp;gt;
  &amp;lt;/stud&amp;gt;
  &amp;lt;stud id=&amp;quot;2&amp;quot;&amp;gt;
    &amp;lt;fname&amp;gt;Saurabh&amp;lt;/fname&amp;gt;
    &amp;lt;lname&amp;gt;Sharma&amp;lt;/lname&amp;gt;
    &amp;lt;class&amp;gt;11&amp;lt;/class&amp;gt;
    &amp;lt;marks&amp;gt;82.7&amp;lt;/marks&amp;gt;
  &amp;lt;/stud&amp;gt;
  &amp;lt;stud id=&amp;quot;3&amp;quot;&amp;gt;
    &amp;lt;fname&amp;gt;Rajesh&amp;lt;/fname&amp;gt;
    &amp;lt;lname&amp;gt;Shah&amp;lt;/lname&amp;gt;
    &amp;lt;class&amp;gt;11&amp;lt;/class&amp;gt;
    &amp;lt;marks&amp;gt;70.3&amp;lt;/marks&amp;gt;
  &amp;lt;/stud&amp;gt;
&amp;lt;/Root&amp;gt;'

> UPDATE marks of Manoj to 78.2 (currently 80.5):

SET @xml.modify('
	replace value of (/Root/stud[@id=&amp;quot;1&amp;quot;]/marks/text())[1]
	with &amp;quot;78.2&amp;quot;')

SELECT @xml

> UPDATE marks of Rajesh to 78.7 stored in a variable:

DECLARE @marks FLOAT
SET @marks = 78.7

SET @xml.modify('
	replace value of (/Root/stud[@id=&amp;quot;3&amp;quot;]/marks/text())[1]
	with sql:variable(&amp;quot;@marks&amp;quot;)')

SELECT @xml

> DELETE Saurabh record where id = 2:

SET @xml.modify('
	delete (/Root/stud[@id=&amp;quot;2&amp;quot;])')

SELECT @xml

Final XML:

Categories: XML Tags: XML, XML DELETE, XML DML, XML UPDATE

Save XML in a file | Read XML from a file (in SQL Server)

October 20, 2011 6 comments

USE [AdventureWorks]
GO

-- Save XML records to a file:
DECLARE @fileName VARCHAR(50)

DECLARE @sqlStr VARCHAR(1000)
DECLARE @sqlCmd VARCHAR(1000)

SET @fileName = 'E:\SQL_Queries\test.xml'
SET @sqlStr = 'select TOP 50 [ContactID], [Title], [FirstName], [MiddleName], [LastName], [Suffix], [EmailAddress], [Phone] from AdventureWorks.Person.Contact FOR XML PATH(''Contact''), ROOT(''Person.Contact'')'

SET @sqlCmd = 'bcp "' + @sqlStr + '" queryout ' + @fileName + ' -w -T'

EXEC xp_cmdshell @sqlCmd

Progress:-
NULL
Starting copy...
NULL
6 rows copied.
Network packet size (bytes): 4096
Clock Time (ms.) Total     : 15     Average : (400.00 rows per sec.)
NULL

Check the file in the directory set above in @filename variable.

Now we will see how can we read the XML file saved above. Will store the XML in into a SQL Server XML variable
and query it with SELECT statement:

DECLARE @xmlStr XML

-- Storing XML records into a XML variable:

SELECT @xmlStr = BulkColumn
FROM OPENROWSET(BULK 'E:\SQL_Queries\test.xml', SINGLE_CLOB) AS X

The above statement gave me ERROR:
SINGLE_CLOB requires a double-byte character set (DBCS) (char) input file. The file specified is Unicode.
Because the XML BCP’d to the file above was saved in unicode format. So use SINGLE_NCLOB instead of SINGLE_CLOB.

SELECT @xmlStr = BulkColumn
FROM OPENROWSET(BULK 'E:\SQL_Queries\test.xml', SINGLE_NCLOB) AS X

This again gave me ERROR:
XML parsing: line 1, character 2034, illegal qualified name character
Because the file saved in xml file has CrLf (Carriage Return & Line Feed) in between.
So I removed the CrLf from the XML string as shown below.

SELECT @xmlStr = REPLACE(REPLACE(BulkColumn,CHAR(10),''),CHAR(13),'')
FROM OPENROWSET(BULK 'E:\SQL_Queries\test.xml', SINGLE_NCLOB) AS X

-- Querying the XML stored in @xmlStr variable with SELECT statement:
SELECT
	T.C.value('ContactID[1]', 'INT') as [ContactID],
	T.C.value('Title[1]', 'VARCHAR(8)') as [Title],
	T.C.value('FirstName[1]', 'VARCHAR(50)') as [FirstName],
	T.C.value('MiddleName[1]', 'VARCHAR(50)') as [MiddleName],
	T.C.value('LastName[1]', 'VARCHAR(50)') as [LastName],
	T.C.value('Suffix[1]', 'VARCHAR(10)') as [Suffix],
	T.C.value('EmailAddress[1]', 'VARCHAR(50)') as [EmailAddress],
	T.C.value('Phone[1]', 'VARCHAR(50)') as [Phone]
FROM @xmlStr.nodes('//Person.Contact/Contact') as T(C)

Categories: XML Tags: BCP, OPENROWSET, read XML, save XML, SINGLE_BLOB, SINGLE_CLOB, SINGLE_NCLOB, XML

Using DMLs with XMLs – INSERT (SQL Server)

October 17, 2011 1 comment

In my previous XML related posts I wrote about how can we query and SELECT XML strings & pull data from parent-child nodes and hierarchies. In this post we’ll see how we can modify that XML data by applying DMLs to it.

In this part let’s see how we can INSERT new records in a XML string:

DECLARE @xml XML

-- Creating sample XML string with single row:
SET @xml=N'&lt;Root&gt;
    &lt;stud id=&quot;1&quot;&gt;
		&lt;fname&gt;Manoj&lt;/fname&gt;
		&lt;lname&gt;Pandey&lt;/lname&gt;
		&lt;class&gt;10&lt;/class&gt;
		&lt;marks&gt;80.5&lt;/marks&gt;
	&lt;/stud&gt;
&lt;/Root&gt;';

-- Now check the sample row by querying it with SELECT statement:
SELECT
	Tab.Col.value('@id','int') AS ID,
	Tab.Col.value('fname[1]','nvarchar(20)') AS FirstName,
	Tab.Col.value('lname[1]','nvarchar(20)') AS LastName,
	Tab.Col.value('class[1]','int') AS class,
	Tab.Col.value('marks[1]','float') AS Marks
FROM @xml.nodes('/Root/stud') Tab(Col)

Output:-
ID FirstName LastName class Marks
1 Manoj Pandey 10 80.5
DMLs in XML provides a common function xml.modify() to modify the XML string in 3 ways, i.e. INSERT, UPDATE & DELETE. Here we will see how we can INSERT new records at different positions in the sample XML created above:

--// Now INSERTing some more records:

-- 1. INSERT (basic)
SET @xml.modify('
insert &lt;stud id=&quot;2&quot;&gt;
		&lt;fname&gt;Saurabh&lt;/fname&gt;
		&lt;lname&gt;Sharma&lt;/lname&gt;
		&lt;class&gt;11&lt;/class&gt;
		&lt;marks&gt;82.7&lt;/marks&gt;
	&lt;/stud&gt;
into (/Root)[1]');

-- 2. INSERT at beginning, first record
SET @xml.modify('
insert &lt;stud id=&quot;0&quot;&gt;
		&lt;fname&gt;Kanchan&lt;/fname&gt;
		&lt;lname&gt;Pandey&lt;/lname&gt;
		&lt;class&gt;10&lt;/class&gt;
		&lt;marks&gt;90.5&lt;/marks&gt;
	&lt;/stud&gt;
as first
into (/Root)[1]');

-- 3. INSERT at end, last record
SET @xml.modify('
insert &lt;stud id=&quot;3&quot;&gt;
		&lt;fname&gt;Rajesh&lt;/fname&gt;
		&lt;lname&gt;Shah&lt;/lname&gt;
		&lt;class&gt;11&lt;/class&gt;
		&lt;marks&gt;70.3&lt;/marks&gt;
	&lt;/stud&gt;
as last
into (/Root)[1]');

-- Let's check the records just inserted:
SELECT
	Tab.Col.value('@id','int') AS ID,
	Tab.Col.value('fname[1]','nvarchar(20)') AS FirstName,
	Tab.Col.value('lname[1]','nvarchar(20)') AS LastName,
	Tab.Col.value('class[1]','int') AS class,
	Tab.Col.value('marks[1]','float') AS Marks
FROM @xml.nodes('/Root/stud') Tab(Col)

Output:-
ID FirstName LastName class Marks
0 Kanchan Pandey 10 90.5
1 Manoj Pandey 10 80.5
2 Saurabh Sharma 11 82.7
3 Rajesh Shah 11 70.3

-- Check final XML with INSERTed records above:
SELECT @xml

In my next XML-DML post we will see UPDATE and DELETE operations.

Categories: XML Tags: insert as first, insert as last, XML, XML Insert

SQL Trivia – Identify & Delete Duplicate records from a table

October 14, 2011 6 comments

I see a lot of questions/posts about dealing with duplicate records in many SQL Server forums. Many of these questions are asked in SQL Server Interviews and many developers starting their carrier in database programming find it challenging to deal with. Here we will see how we can deal with such records.

Duplicate data is one of the biggest pain points in our IT industry, which various projects have to deal with. Whatever state-of-art technology and best practices followed, the big ERP, CRM, SCM and other inventory based database management projects ends up in having duplicate & redundant data. Duplicate data keeps on increasing by manual entries and automated data loads. Various data leads getting pumped into system’s databases without proper deduping & data cleansing leads to redundant data and thus duplicated record-sets.

Data cleansing requires regular exercise of identifying duplicates, validating and removing them. To minimize these type of scenarios various checks and filters should also be applied before loading new leads into the system.

–> Lets check this by a simple exercise how we can identify & remove duplicate data from a table:

1. Insert some sample records from Person.Contact table of [AdventureWorks] database:

USE [AdventureWorks]
GO

SELECT TOP 10 ContactID, FirstName, LastName, EmailAddress, Phone
INTO DupContacts
FROM Person.Contact

SELECT * FROM DupContacts

2. Insert some duplicate records from the same list inserted above:

INSERT INTO DupContacts
SELECT TOP 50 PERCENT FirstName, LastName, EmailAddress, Phone
from DupContacts

SELECT * FROM DupContacts

3. Insert some more duplicate records from the same list inserted above.

INSERT INTO DupContacts
SELECT TOP 20 PERCENT FirstName, LastName, EmailAddress, Phone
from DupContacts

SELECT * FROM DupContacts

–> Identify Duplicate records & delete them

Method #1: by using ROW_NUMBER() function:

;WITH dup as (
SELECT ContactID, FirstName, LastName, EmailAddress, Phone,
ROW_NUMBER() OVER(PARTITION BY FirstName, LastName ORDER BY ContactID) AS NumOfDups
FROM DupContacts)
SELECT * FROM dup
WHERE NumOfDups > 1
ORDER BY ContactID

-- Remove/Delete duplicate records:
;WITH dup as (
SELECT ContactID, FirstName, LastName, EmailAddress, Phone,
ROW_NUMBER() OVER(PARTITION BY FirstName, LastName ORDER BY ContactID) AS NumOfDups
FROM DupContacts)
DELETE FROM dup
WHERE NumOfDups > 1

SELECT * FROM DupContacts

Method #2: by using SELF-JOIN:

SELECT DISTINCT a.ContactID, a.FirstName, a.LastName, a.EmailAddress, a.Phone
FROM DupContacts a
JOIN DupContacts b
ON a.FirstName = b.FirstName
AND a.LastName = b.LastName
AND a.ContactID > b.ContactID

-- Remove/Delete duplicate records:
DELETE a
FROM DupContacts a
JOIN DupContacts b
ON a.FirstName = b.FirstName
AND a.LastName = b.LastName
AND a.ContactID > b.ContactID

SELECT * FROM DupContacts

Method #3: by using AGGREGATES & Sub-QUERY:

SELECT * FROM DupContacts
WHERE ContactID NOT IN (SELECT MIN(ContactID)
FROM DupContacts
GROUP BY FirstName, LastName)

-- Remove/Delete duplicate records:
DELETE FROM DupContacts
WHERE ContactID NOT IN (SELECT MIN(ContactID)
FROM DupContacts
GROUP BY FirstName, LastName)

SELECT * FROM DupContacts

–> Final Cleanup

DROP TABLE DupContacts

Check the same demo here:

Categories: SQL Tips, SQL Trivia Tags: Delete Duplicate, ROW_NUMBER, SELF JOIN

What are Records, Pages, Extents, Heap in SQL Server?

October 12, 2011 1 comment

I found this very interesting series by Paul Randal on some basic terms of SQL Server Storage Engine at MSDN SQL Server blog. This is a 4 part series and provides information on records, pages, extents and heap structures.

I’ve seen many people asking questions related to these topics on MSDN & other SQL Server forums. And there are very few tech articles on internet and MS BOL regarding these. Paul has very breifly and in layman terms explained these topics.

I’d like to share the following topics for my future reference and for my blog viewers:

Part 1. What are Rows/Records?

Part 2. What are Pages?

Part 3. What are Extents?

Part 4. What is a Heap?

–> PAGE: A page in SQL Server is an 8 KB data storage area. There are 8 types of pages:

1. Data page
2. Index page
3. IAM (Index Allocation Map) page
4. GAM (Global Allocation Map) and SGAM (Shared Global Allocation Map) page
5. PFS (Page Free Space) page
6. Text/Image page
7. Sort page
8. Differential Bitmap page
9. Bulk-Changed Map page
10. Boot page
11. FileHeader page