Archive for the ‘SQL Server Internals’ Category

DB Basics – How to control Data Redundancy in a database system (RDBMS) by Normalization

April 12, 2016 Leave a comment

Redundancy in Database systems occurs with various insert, update, and delete anomalies.

To avoid these anomalies in first step you need to make sure your database tables or relations are in good normal forms or normalized upto a certain level.

Normalization is a systematic way of ensuring that a database structure is suitable for general-purpose querying and free the anomalies discussed above. that could lead to a loss of data integrity.

Normal forms in a database or the concept of Normalization makes a Relation or Table free from insert/update/delete anomalies and saves space by removing duplicate data.

–> According to E. F. Codd the objectives of normalization were stated as follows:

1. To free the collection of relations from undesirable insertion, update and deletion dependencies.

2. To reduce the need for restructuring the collection of relations as new types of data are introduced, and thus increase the life span of application programs.

3. To make the relational model more informative to users.

4. To make the collection of relations neutral to the query statistics, where these statistics are liable to change as time goes by.

As of now there are total 8 normal forms, but to keep our data consistent & non-redundant the first 3 Normal Forms are sufficient.

–> Anomalies like: Let’s say you have a single table that stores Employee and Department details, thus:

1. Insert Anomaly: If you are inserting a detail of an Employee then his department detail will also be entered for every employee record, thus departments details will be repeated with multiple records, thus storing duplicate data for Departments.

2. Update Anomaly: While updating a department detail you have to update the same department for various employees, which may lead to inconsistent state if any record is left while updating or on any error.

3. Delete Anomaly: If a department is closed, then deleting department record will also delete the Employee records, thus missing records.

The process of normalization makes this EmployeeDepartment table to decompose or split into 2 or more tables and linked them by Foreign Keys, thus eliminating duplicate records, data redundancy and making data/records consistent across all relations/tables.

– 1st NF talks about atomic values and non-repeating groups.

– 2nd NF enforces that a non-Key attribute should belong to entire Key attribute.

– 3rd NF makes sure that there should be no transitive dependency between a non-Key and a Key attribute.

For details on these 3 NFs check my blog post on [Database Normalization].

Parse or Query XML column with XMLNAMESPACES (xmlns namespace) – MSDN TSQL forum

January 15, 2016 Leave a comment

–> Question:

We have SQL audit information.
We would like to select XML column in some user friendly way.

CREATE TABLE [dbo].[audit](
	[server_instance_name] [NVARCHAR](128) NULL,
	[statement] [NVARCHAR](4000) NULL,
	[additional_information] XML NULL

INSERT INTO [dbo].[audit]([server_instance_name],[statement],[additional_information]) 

INSERT INTO [dbo].[audit]([server_instance_name],[statement],[additional_information]) 
VALUES('srv2','','<action_info xmlns=""><session><![CDATA[Audit$A]]></session><action>event enabled</action><startup_type>manual</startup_type><object><![CDATA[audit_event]]></object></action_info>')

SELECT * FROM [dbo].[audit]

Output of the XML column:

XML parse

Required Output:

XML parse2


–> Answer:

	,t.c.value ('ns:session[1]', 'varchar(50)') AS session
	,t.c.value ('ns:action[1]', 'varchar(50)') AS action
	,t.c.value ('ns:startup_type[1]', 'varchar(50)') AS startup_type
	,t.c.value ('ns:object[1]', 'varchar(50)') AS object
FROM [audit] as a
OUTER APPLY a.additional_information.nodes('//ns:action_info') as t(c)

-- OR -- 

	,t.c.value ('session[1]', 'varchar(50)') AS session
	,t.c.value ('action[1]', 'varchar(50)') AS action
	,t.c.value ('startup_type[1]', 'varchar(50)') AS startup_type
	,t.c.value ('object[1]', 'varchar(50)') AS object
FROM [audit] as a
OUTER APPLY a.additional_information.nodes('//action_info') as t(c)


Drop table finally

DROP TABLE [audit]


Ref link.

Passing multiple/dynamic values to Stored Procedures & Functions | Part 5 – by passing JSON string

October 31, 2015 3 comments

This post is part of the [Passing multiple/dynamic values to Stored Procedures & Functions] series, and as well as the new feature Native JSON support in SQL Server 2016.

Adding the fifth part to this series we will use JSON string that will contain the set of values and pass as an JSON param variable to the SP. Then inside the SP we will parse this JSON and use those values in our SQL Queries, just like we did in previous posts with CSV/XML strings:

USE [AdventureWorks2014]

-- Create an SP with NVARCHAR type parameter for JSON string:
	@persons NVARCHAR(MAX)
	--SET @persons = '{"root":[{"Name":"Charles"},{"Name":"Jade"},{"Name":"Jim"},{"Name":"Luke"},{"Name":"Ken"}]}'
	INTO #tblPersons
	FROM OPENJSON (@persons, '$.root')
	WITH ( 
		Name NVARCHAR(100)

	FROM [Person].[Person] PER
		FROM #tblPersons tmp 
		WHERE tmp.Name  = PER.FirstName
	ORDER BY FirstName, LastName

	DROP TABLE #tblPersons

-- Create JSON string:
SET @json = N'{
  "root": [
    { "Name": "Charles" },
    { "Name": "Jade" },
    { "Name": "Jim" },
    { "Name": "Luke" },
    { "Name": "Ken" }

-- Use the JSON string as parameter which calling the SP:
EXEC uspGetPersonDetailsJSON @json

-- Check the output, objective achieved

-- Final Cleanup


Thus you can also use JSON string similar to the way you used XML string, to pass multiple and dynamic number of parameters to your Stored Procedures.

As JSON feature is new to SQL Server 2016, so this method will only work with SQL Server 2016 and above versions.

DB Basics – As Primary Key can identify a row uniquely, what are the scenarios you need a Super Key?

September 17, 2015 Leave a comment

In one of my [previous post] I discussed about various types of Keys in DBMS, like Candidate, Primary, Composite & Super Keys.

I got few questions like:
1. If there is already a Primary Key to identify a row uniquely, then in which scenario a Super Key can use used?
2. How to create a Super Key like a Primary Key in SQL Server or Oracle?

–> First of all what is a Primary Key (PK): It is the Candidate key attribute/column that is most suited to maintain uniqueness in a table at the tuple/row level.

–> And a Super Key is a super-set of Candidate key. If you add any other attribute/column to a Candidate Key then it become a Super Key. Or a minimal Super Key is called a Candidate Key.

–> Now consider the same example that I discussed in my [previous post]:

    DOB, -- Date of birth
    DOJ, -- Date of joining
    SSN, -- Social Security Number
    DeptID, -- Department ID
    MgrID -- Manager ID

Now in this table there are 2 Candidate Keys i.e. EmployeeID & SSN.
– EmployeeID: should be unique with auto-increment column.
– SSN: should also be unique for every employee.

So, here you cannot make SSN as a Primary Key, because its a PII and secure data.
Hence, creating EmployeeID as Primary Key makes sense, this way SSN will be left as an Alternate Key.

–> Now, coming to the Super Key concept: As per the above definition, a minimal Super Key will be called as a Candidate or Primary Key. Thus every Primary Key and Alternate Key is also a Super Key. And if you combine any other column with the EmployeeID (PK) column, the combination will also be called as a Super Key, like:
– EmployeeID
– EmployeeID + DOB
– EmployeeID + DOJ
– EmployeeID + DeptID + MgrID
– EmployeeID + DOB + DOJ + DeptID + MgrID
– EmployeeID + [other column combinations]

For all the above and other possible combinations the row will still be identified uniquely, thus all these combinations will act like a Super Key.

And, if you add all columns with the EmployeeID, it is called a Trivial Super Key, because it uses all columns of Employee table to identify a row.

And in SQL Server or Oracle there is no syntax as such to create a Super Key, this is just a concept in DBMS.

Hope this helps !!!

Categories: DB Concepts Tags: ,

SQL Basics – What are System databases | master, model, msdb, tempdb, resource

July 17, 2015 Leave a comment

While working on SQL Server Management Studio you might have seen some default databases in System Database folder. And you would have observed the same set of databases on other machines or database servers you might have worked on or seen, as shown in the image below (red-circled):

SQL System Databases

There are total 5 System Databases, out of these you can see 4 under the System Databases folder (above image, ignore the blue circled). The 5th database is Resource which does not appear on SSMS.

–> Let’s check about these databases individually here:

1. master: database records all the system-level information for an instance of SQL Server, which includes instance-wide metadata such as logon accounts, endpoints, linked servers, and system configuration settings.

It records the existence of all other databases and the location of those database files and records the initialization information for SQL Server.

SQL Server cannot start if the master database is unavailable. If in case you would like to move the master database or have moved it and ran into issues, check this post Move Master database.

Starting SQL Server 2005 and ahead, system objects are no longer stored in the master database, instead, they are stored in the Resource database, mentioned below (5th type).

2. model: database is used as the template for all databases created on an instance of SQL Server. The entire contents of the model database, including database options, are copied to the new database. Thus, if you modify the model database, all databases created afterward will inherit those changes.

Because tempdb is created every time SQL Server is started, the model database must always exist on a SQL Server system.

3. msdb: database is used by SQL Server Agent for scheduling alerts and jobs and by other features such as SSMS, Service Broker and Database Mail.

SQL Server automatically maintains a complete online backup-and-restore history within tables in msdb, which includes the name of the party that performed the backup, the time of the backup, and the devices or files where the backup is stored.

4. tempdb: database is a workspace for holding temporary objects or intermediate result sets.

tempdb is a global resource that is available to all users connected to the instance of SQL Server and is used to hold the following:

– Temporary user objects that are explicitly created, such as: global or local temporary tables, temporary stored procedures, table variables, or cursors.

– Internal objects that are created by the SQL Server Database Engine, for example, work tables to store intermediate results for spools or sorting.

– Row versions that are generated by data modification transactions in a database that uses read-committed using row versioning isolation or snapshot isolation transactions.

– Row versions that are generated by data modification transactions for features, such as: online index operations, Multiple Active Result Sets (MARS), and AFTER triggers.

tempdb is re-created every time SQL Server is started so that the system always starts with a clean copy of the database.

Temporary tables and stored procedures are dropped automatically on disconnect, and no connections are active when the system is shut down. Thus, there is never anything in tempdb to be saved from one session of SQL Server to another.

5. Resource: database does not appear on SSMS, and is a read-only database that contains all the system objects that are included with SQL Server.

SQL Server system objects, such as sys.objects, are physically persisted in the Resource database, but they logically appear in the sys schema of every database. The Resource database does not contain user data or user metadata.

The ID of the Resource database is always 32767.

The physical file names of the Resource database are mssqlsystemresource.mdf and mssqlsystemresource.ldf, and are located in :\Program Files\Microsoft SQL Server\MSSQL.\MSSQL\Binn\ folder.

Check the Resource Database version and the last updated date:

SELECT SERVERPROPERTY('ResourceVersion') AS ResourceVersion;
SELECT SERVERPROPERTY('ResourceLastUpdateDateTime') AS ResourceLastUpdateDateTime;

SQL System Databases Resource

Check my other post on Resource Database.

–> You can also query all these system databases, except the Resource database.

SQL System Databases files

And you can see above first 4 database IDs, 1-4 are reserved for master, tempdb, model and msdb databases.

–> There are two more databases created while you install SQL Server with Reporting Services (SSRS). You can see these databases appearing in SSMS in the image above (circled blue):

1. reportServer: stores following:

– Items managed by a report server (reports and linked reports, shared data sources, report models, folders, resources) and all of the properties and security settings that are associated with those items.

– Subscription and schedule definitions.

– Report snapshots (which include query results) and report history.

– System properties and system-level security settings.

– Report execution log data.

– Symmetric keys and encrypted connection and credentials for report data sources.

2. reportServerTempdb: Each report server database uses a related temporary database to store session and execution data, cached reports, and work tables that are generated by the report server. Background server processes will periodically remove older and unused items from the tables in the temporary database.