Archive
Export XML column data to a file.xml
There are times when you want to export XML data stored in a table's columns to XML files for some purpose.
Here is a small hands-on script that uses the BCP command with the queryout option to export data from SQL Server to the file system:
USE [AdventureWorks2012]
GO

-- Save XML records to a file:
DECLARE @fileName VARCHAR(50)
DECLARE @sqlStr VARCHAR(1000)
DECLARE @sqlCmd VARCHAR(1000)

SET @fileName = 'D:\SQL_Queries\PersonAdditionalContactInfo.xml'
SET @sqlStr = 'select TOP 1 AdditionalContactInfo from AdventureWorks2012.Person.Person where AdditionalContactInfo IS NOT NULL'
SET @sqlCmd = 'bcp "' + @sqlStr + '" queryout ' + @fileName + ' -w -T'

EXEC xp_cmdshell @sqlCmd
Output:

NULL
Starting copy...
NULL
1 rows copied.
Network packet size (bytes): 4096
Clock Time (ms.) Total : 140 Average : (7.14 rows per sec.)
NULL
If you have to extract the XML stored in all the rows, you can iterate through the rows one by one by using a WHILE loop or a CURSOR and extract the data into separate files. You can give each file a unique name dynamically by tweaking the above code, as shown in the sketch below.
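Here is a minimal sketch of that approach, assuming xp_cmdshell is enabled and the D:\SQL_Queries\ folder exists; the cursor, the ContactInfo_<id>.xml file-naming pattern and the TOP 10 filter are illustrative only:

USE [AdventureWorks2012]
GO

-- Sketch: export the XML column of each row to its own file,
-- using the row's BusinessEntityID to build a unique file name.
DECLARE @id INT
DECLARE @fileName VARCHAR(200)
DECLARE @sqlStr VARCHAR(1000)
DECLARE @sqlCmd VARCHAR(2000)

DECLARE xmlCursor CURSOR FAST_FORWARD FOR
    SELECT TOP 10 BusinessEntityID          -- TOP 10 just to keep the demo small
    FROM AdventureWorks2012.Person.Person
    WHERE AdditionalContactInfo IS NOT NULL

OPEN xmlCursor
FETCH NEXT FROM xmlCursor INTO @id

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Build a unique file name per row, e.g. D:\SQL_Queries\ContactInfo_291.xml
    SET @fileName = 'D:\SQL_Queries\ContactInfo_' + CAST(@id AS VARCHAR(10)) + '.xml'

    SET @sqlStr = 'select AdditionalContactInfo from AdventureWorks2012.Person.Person '
                + 'where BusinessEntityID = ' + CAST(@id AS VARCHAR(10))

    SET @sqlCmd = 'bcp "' + @sqlStr + '" queryout ' + @fileName + ' -w -T'

    EXEC xp_cmdshell @sqlCmd

    FETCH NEXT FROM xmlCursor INTO @id
END

CLOSE xmlCursor
DEALLOCATE xmlCursor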
Index Usage Stats – Indexes Used, Unused and Updated
While working on Performance Optimization and Index tuning, many developers end up creating some (or even many) unnecessary Indexes based upon various SQL queries. Some or many of those Indexes might not be used at all by the SQL Query Optimizer. But even if they are used, you may have to trade off some ETL performance, as the write operations (INSERT, UPDATE & DELETE) are going to take more time to keep those new Indexes updated.
So, while creating new Indexes you have to plan very carefully and balance things out, so that your data retrieval is fast and, at the same time, your ETLs are not affected much.
SQL Server provides some DMVs (Dynamic Management Views) and DMFs (Dynamic Management Functions) to get this information from the SQL engine.
–> To know how the Indexes are getting used, we can use the sys.dm_db_index_usage_stats DMV, which tells us how many times each Index was used for SEEK, SCAN & LOOKUP operations. Check the query below:
SELECT
    DB_NAME(database_id) AS DATABASE_NAME,
    OBJECT_NAME(ius.object_id) AS TABLE_NAME,
    ids.name AS INDEX_NAME,
    ius.user_seeks AS SEEK_COUNT,
    ius.user_scans AS SCAN_COUNT,
    ius.user_lookups AS LOOKUP_COUNT,
    ius.user_seeks + ius.user_scans + ius.user_lookups AS TOTAL_USAGE,
    ius.last_user_seek AS LAST_SEEK_COUNT,
    ius.last_user_scan AS LAST_SCAN_COUNT,
    ius.last_user_lookup AS LAST_LOOKUP_COUNT
FROM sys.dm_db_index_usage_stats AS ius
INNER JOIN sys.indexes AS ids
    ON ids.object_id = ius.object_id
    AND ids.index_id = ius.index_id
WHERE OBJECTPROPERTY(ius.object_id, 'IsUserTable') = 1
ORDER BY DATABASE_NAME, TOTAL_USAGE
–> Now, if we want to know the maintenance overhead on the new Indexes we created (every time a related Table is updated, its Indexes are also updated), we can check by using the sys.dm_db_index_operational_stats DMF, which shows how many INSERT, UPDATE & DELETE operations are happening on particular Indexes. Check the query below:
SELECT
    DB_NAME(database_id) AS DATABASE_NAME,
    OBJECT_NAME(ios.object_id) AS TABLE_NAME,
    idx.name AS INDEX_NAME,
    ios.leaf_insert_count AS INSERT_COUNT,
    ios.leaf_update_count AS UPDATE_COUNT,
    ios.leaf_delete_count AS DELETE_COUNT,
    ios.leaf_insert_count + ios.leaf_update_count + ios.leaf_delete_count AS TOTAL_COUNT
FROM sys.dm_db_index_operational_stats (NULL, NULL, NULL, NULL) ios
INNER JOIN sys.indexes AS idx
    ON idx.object_id = ios.object_id
    AND idx.index_id = ios.index_id
WHERE OBJECTPROPERTY(ios.object_id, 'IsUserTable') = 1
ORDER BY DATABASE_NAME, TOTAL_COUNT
–> So, by using these two Dynamic Management Views/Functions you can know:
1. What is the effect (maintenance overhead) of the Indexes you created, and
2. Whether the Indexes are really used or not, so that you can DROP the unused ones (see the sketch after the notes below).
Please note that these stats could be misleading at times, for several reasons:
1. If SQL Server (MSSQLSERVER) service is re-started these counters are initialized to 0.
2. When a Database is detached or is shut down (for example, because AUTO_CLOSE is set to ON), all rows associated with the database are removed.
3. Index Rebuild resets these counters to 0 (Bug logged in MS Connect, link).
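With those caveats in mind, here is a minimal sketch (not a definitive script) that lists Indexes of the current database with no recorded seeks, scans or lookups since the last restart; the current-database filter and the exclusion of clustered, Primary Key and unique-constraint Indexes are my own assumptions:

SELECT
    OBJECT_NAME(idx.object_id) AS TABLE_NAME,
    idx.name AS INDEX_NAME
FROM sys.indexes AS idx
LEFT JOIN sys.dm_db_index_usage_stats AS ius
    ON ius.object_id = idx.object_id
    AND ius.index_id = idx.index_id
    AND ius.database_id = DB_ID()            -- only usage recorded for the current database
WHERE OBJECTPROPERTY(idx.object_id, 'IsUserTable') = 1
    AND idx.type_desc = 'NONCLUSTERED'       -- skip heaps and clustered indexes
    AND idx.is_primary_key = 0               -- assumption: keep PK and unique indexes
    AND idx.is_unique_constraint = 0
    AND ISNULL(ius.user_seeks, 0) + ISNULL(ius.user_scans, 0) + ISNULL(ius.user_lookups, 0) = 0
ORDER BY TABLE_NAME, INDEX_NAME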
DB Basics – What are ACID properties of a Transaction in an RDBMS?
In order to perform a Transaction in a database system and to make sure it works without any issues, there are a few rules a Database Transaction should follow. These rules are standard across all Relational Database systems (RDBMS) and are called the ACID rules.
ACID stands for Atomicity, Consistency, Isolation and Durability. So let’s check what each of these rules states.
–> A: Atomicity states that every Transaction should be atomic in nature. A Transaction in a Relational Database can contain either a single SQL statement or multiple SQL statements. Thus, an atomic Transaction means “all or none”: either all SQL statements/steps in a transaction execute successfully, or they fail as a single unit, none of them is treated as executed, and the system is returned to its original state.
For example: account-A & account-B both have a $2000 balance, and you have to transfer $1000 from account-A to account-B. This involves 2 steps: first, a withdrawal from account-A, and second, a deposit into account-B. Both steps should be treated as a single, atomic unit, and at the end account-A should have a $1000 and account-B a $3000 balance. If the system fails or any error occurs after the first step, then the first step should also be rolled back and the $1000 withdrawn from account-A should be re-deposited, leaving $2000 back in both accounts. There should be no intermediate state where account-A has $1000 and account-B still has $2000.
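A minimal T-SQL sketch of this transfer, assuming a hypothetical Accounts table with AccountID and Balance columns; the TRY/CATCH wrapper is just one common way to get the all-or-none behaviour:

-- Assumption: Accounts(AccountID INT PRIMARY KEY, Balance MONEY) already exists
BEGIN TRY
    BEGIN TRANSACTION;

    -- Step 1: withdraw $1000 from account-A
    UPDATE Accounts SET Balance = Balance - 1000 WHERE AccountID = 1;

    -- Step 2: deposit $1000 into account-B
    UPDATE Accounts SET Balance = Balance + 1000 WHERE AccountID = 2;

    COMMIT TRANSACTION;   -- both steps succeeded, make them permanent
END TRY
BEGIN CATCH
    -- Any failure in either step undoes both, restoring the original balances
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;
    THROW;                -- re-raise the error to the caller (SQL Server 2012+)
END CATCH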
–> C: Consistency states that any Transaction that happens in a database takes it from one consistent state to another consistent state. The data finally recorded in the database must be valid according to the defined Rules, Constraints, Cascades, Triggers, etc. In case of any failure against these rules, the changes made by the transaction should be rolled back, putting the system back into its earlier consistent state.
For example: suppose the money deposit process has a Trigger built on top of it. If, at the time of the money transfer, any Trigger or any database node fails, the system should automatically roll back the complete transaction and switch the system back to the consistent state it was in before the transaction started. If everything executes successfully, then the system is committed to a new consistent state.
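As a small illustration (my own example, not from the original post), a CHECK constraint on the hypothetical Accounts table above can enforce such a rule, so a transfer that would overdraw an account fails and the whole transaction rolls back:

-- Assumption: the same hypothetical Accounts table as in the Atomicity sketch
ALTER TABLE Accounts
    ADD CONSTRAINT CK_Accounts_NonNegativeBalance CHECK (Balance >= 0);

-- A withdrawal that would make the balance negative now violates the constraint,
-- the statement fails, and the surrounding transaction (see the TRY/CATCH sketch
-- above) rolls back, leaving the database in its previous consistent state.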
–> I: Isolation means Transactions working on the same data should run in isolation and not interfere with each other, avoiding dirty reads & writes while still allowing concurrency. One needs to use proper Transaction Isolation levels and locking in order to ensure this.
For example: two people access a joint account with a $5000 balance from 2 terminals to withdraw money. Let’s say at the same time John & Marry apply to withdraw $4000 from two different ATMs. If both Transactions do not run in isolation and run in parallel, then both John & Marry will be able to withdraw $4000 each, i.e. $8000 in total, from the account. To make sure this does not happen, the Transactions should not be allowed to run in parallel on the same data, by setting Transaction Isolation levels and/or locking methods on the database objects.
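One possible sketch of guarding such a withdrawal, again using the hypothetical Accounts table; the UPDLOCK/HOLDLOCK hints are just one way to serialize the check-then-withdraw sequence:

-- Assumption: the same hypothetical Accounts table; @amount is the requested withdrawal
DECLARE @amount MONEY = 4000;

BEGIN TRANSACTION;

-- Lock the account row for the duration of the transaction, so a concurrent
-- withdrawal has to wait until this one commits or rolls back
IF EXISTS (SELECT 1 FROM Accounts WITH (UPDLOCK, HOLDLOCK)
           WHERE AccountID = 1 AND Balance >= @amount)
BEGIN
    UPDATE Accounts SET Balance = Balance - @amount WHERE AccountID = 1;
    COMMIT TRANSACTION;   -- withdrawal allowed
END
ELSE
BEGIN
    ROLLBACK TRANSACTION; -- insufficient funds, nothing changes
END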
–> D: Durability: a transaction should be durable, storing the data permanently and keeping it available even in case of power failure, system failure, crash, or any other error. All in all, the data should not get lost in any of these mishaps, and one should be able to recover it through restore, logging and other methods.
DB Basics – What are Candidate, Primary, Composite & Super Keys and Difference between them?
Let’s take an example of an Employee table:
CREATE TABLE Employee (
    EmployeeID   INT NOT NULL,  -- illustrative data types, not part of the original post
    EmployeeName VARCHAR(100),
    DOB          DATE,          -- Date of birth
    DOJ          DATE,          -- Date of joining
    SSN          VARCHAR(11),   -- Social Security Number
    DeptID       INT,           -- Department ID
    MgrID        INT            -- Manager ID
)
1. Candidate Key: is the attribute/column, or a set of attributes/columns, in a relation/table that qualifies for uniqueness of each tuple/row. A relation/table can have one or more Candidate Keys. A Candidate Key is also known as a minimal Super Key.
Here in the Employee table, the columns EmployeeID & SSN can individually maintain uniqueness in the table, thus both are eligible to be Candidate Keys. The columns EmployeeName + DOB combined could also make up a Candidate Key, but there is a narrow chance that 2 Employees with the same name could be born on the same day.
2. Primary Key: is the Candidate key attribute/column that is most suited to maintain uniqueness in a table at the tuple/row level. More about PK.
Here in the Employee table you can choose either the EmployeeID or the SSN column for the PK; EmployeeID is the preferable choice because SSN is a sensitive (PII) value.
3. Alternate Key: are the other Candidate Key attributes/columns that you didn’t choose as the Primary Key column.
Like if you choose EmployeeID as a PK then SSN would be the Alternate key.
4. Super Key: is a superset of a Candidate Key. If you add any other attribute/column to a Candidate Key then it becomes a Super Key.
Like EmployeeID + EmployeeName is a Super Key.
5. Composite Key: if a table does not have any single column that qualifies as a Candidate Key, then you have to select 2 or more columns together to make a row unique.
For example, if there were no EmployeeID or SSN columns in the Employee table, then you could make EmployeeName + DOB a Composite Primary Key. But there would still be a narrow chance of duplicate rows. The sketch below shows how these keys map to table constraints.
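A minimal sketch of how these keys could be declared as constraints on the Employee table above (the constraint names and the alternate-key/composite-key choices are illustrative):

-- Primary Key on the chosen Candidate Key
ALTER TABLE Employee
    ADD CONSTRAINT PK_Employee PRIMARY KEY (EmployeeID);

-- Alternate Key: the Candidate Key not chosen as PK, enforced with a UNIQUE constraint
ALTER TABLE Employee
    ADD CONSTRAINT UQ_Employee_SSN UNIQUE (SSN);

-- Composite Key example: if EmployeeID and SSN did not exist,
-- EmployeeName + DOB together could serve as a (composite) Primary Key
-- ALTER TABLE Employee
--     ADD CONSTRAINT PK_Employee_NameDOB PRIMARY KEY (EmployeeName, DOB);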
How to convert ‘20140620’ date to ‘DD-MM-YYYY’ format – MSDN TSQL forum
–> Question:
I have data for date fields like “20140620” (YYYYMMDD); while inserting into the table it should be in the format “DD-MM-YYYY”.
Can anyone please help me to achieve this.
–> My Response:
SELECT FORMAT(CAST('20140620' as DATE), 'dd-MM-yyyy', 'en-US'), CAST('20140620' as DATE)
… this will work with SQL Server 2012 and above.
–> Other Responses:
-- by Olaf Helper
SELECT CONVERT(varchar(10), CONVERT(datetime, '20140620', 112), 105) AS DDMMYYY

-- by Latheesh NK
SELECT CONVERT(VARCHAR(10), '20140620', 105)
Ref Link.