Archive
Memory Optimized Indexes | Hash vs Range Indexes – SQL Server 2014
In SQL Server 2014 there are a lot of DDL changes for In-Memory tables compared with normal Disk Based Tables. We have seen the In-Memory Table related changes in previous posts, check [here]. Here we will look at the Memory Optimized Index related changes and a few important things to take care of before designing your Tables and Indexes.
–> Some of the main points to note are:
1. Indexes on In-Memory tables must be created inline with CREATE TABLE DDL script only.
2. These Indexes are not persisted on Disk and reside only in memory, thus they are not logged. Because these Indexes are not persistent, they are re-created whenever SQL Server is restarted. In-Memory tables also DO NOT support Clustered Indexes.
3. Only two types of Indexes can be created on In-Memory tables, i.e. Non Clustered Hash Index and Non Clustered Index (aka Range Index). So there is no bookmark lookup.
4. These Non Clustered Indexes are inherently Covering, and all columns are automatically INCLUDED in the Index.
5. In total, a MAX of 8 Non Clustered Indexes can be created on an In-Memory table.
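To illustrate points 1–3 above, here is a sketch of a single CREATE TABLE statement declaring both index types inline (the table and index names are illustrative, not from this post's demo):

```sql
-- Both indexes must be declared inline with CREATE TABLE;
-- you cannot add them later with CREATE INDEX on an In-Memory table.
CREATE TABLE dbo.MemOptTable_BothIndexTypes
(
    ID INT NOT NULL
        PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 10000), -- Hash Index
    VarString VARCHAR(200) NOT NULL,
    DateAdded DATETIME NOT NULL,
    INDEX IX_DateAdded NONCLUSTERED (DateAdded)                    -- Range Index
) WITH (
    MEMORY_OPTIMIZED = ON,
    DURABILITY = SCHEMA_AND_DATA
)
GO
```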
–> Here we will see how Query Optimizer uses Hash & Range Indexes to process query and return results:
1. Hash Indexes: are used for Point Lookups or Seeks. They are optimized for index seeks on equality predicates and also support full index scans. Thus they will only perform better when the predicate clause contains only equality predicates (=).
2. Range Indexes: are used for Range Scans and Ordered Scans. They are optimized for index scans on inequality predicates, such as greater than or less than, as well as sort order. Thus they will only perform better when the predicate clause contains inequality predicates (>, <, >=, <=, BETWEEN).
–> Let’s check this by some hands-on code. We will create 2 similar In-Memory tables, one with Range Index and another with Hash Index:
-- Create In-Memory Table with simple NonClustered Index (a.k.a Range Index):
CREATE TABLE dbo.MemOptTable_With_NC_Range_Index
(
ID INT NOT NULL
PRIMARY KEY NONCLUSTERED,
VarString VARCHAR(200) NOT NULL,
DateAdded DATETIME NOT NULL
) WITH (
MEMORY_OPTIMIZED = ON,
DURABILITY = SCHEMA_AND_DATA
)
GO
-- Create In-Memory Table with NonClustered Hash Index:
CREATE TABLE dbo.MemOptTable_With_NC_Hash_Index
(
ID INT NOT NULL
PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 10000),
VarString VARCHAR(200) NOT NULL,
DateAdded DATETIME NOT NULL
) WITH (
MEMORY_OPTIMIZED = ON,
DURABILITY = SCHEMA_AND_DATA
)
GO
–> Now we will insert about 10k records into both tables, so that we have a good number of records to compare:
DECLARE @i INT = 1
WHILE @i <= 10000
BEGIN
INSERT INTO dbo.MemOptTable_With_NC_Range_Index
VALUES(@i, REPLICATE('a', 200), GETDATE())
INSERT INTO dbo.MemOptTable_With_NC_Hash_Index
VALUES(@i, REPLICATE('a', 200), GETDATE())
SET @i = @i+1
END
–> Now check the Execution Plan by using equality Operator (=) on both the tables:
SELECT * FROM MemOptTable_With_NC_Hash_Index WHERE ID = 5000 -- 4%
SELECT * FROM MemOptTable_With_NC_Range_Index WHERE ID = 5000 -- 96%
You will see in the Execution Plan image below that Equality Operator with Hash Index Costs you only 4%, but Range Index Costs you 96%.
–> Now check the Execution Plan by using inequality Operator (BETWEEN) on both the tables:
SELECT * FROM MemOptTable_With_NC_Hash_Index WHERE ID BETWEEN 5000 AND 6000 -- 99%
SELECT * FROM MemOptTable_With_NC_Range_Index WHERE ID BETWEEN 5000 AND 6000 -- 1%
You will see in the Execution Plan image below that Inequality Operator with Range Index Costs you only 1%, but Hash Index Costs you 99%.
So, while designing In-Memory Tables and Memory Optimized Indexes you need to consider in advance how you are going to query that table. The right choice also depends upon various scenarios and conditions, so always keep these things in mind while designing your In-Memory Tables.
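For Hash Indexes, the BUCKET_COUNT you pick at design time matters too. A quick sanity check, sketched below against the demo table from this post, is to query the hash index stats DMV after loading data; a high avg_chain_length or very low empty_bucket_count suggests the bucket count was set too small:

```sql
-- Inspect hash bucket usage for the demo table created above.
SELECT
    OBJECT_NAME(hs.object_id) AS TableName,
    hs.total_bucket_count,
    hs.empty_bucket_count,
    hs.avg_chain_length,      -- ideally close to 1
    hs.max_chain_length
FROM sys.dm_db_xtp_hash_index_stats AS hs
WHERE hs.object_id = OBJECT_ID('dbo.MemOptTable_With_NC_Hash_Index')
GO
```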
XTP (eXtreme Transaction Processing) with Hekaton Tables & Native Compiled Stored Procedures – SQL Server 2014
In my previous posts [this & this] I talked about creating a Memory Optimized Database, how to create In-Memory Tables & Native Compiled Stored Procedures, and what happens when they are created.
Here in this post we will see how FAST In-Memory tables & Native Compiled Stored Procedures actually are, when compared with normal Disk based Tables & simple Stored Procedures.
I’ll be using the same [ManTest] database used in my previous posts, you can refer to the DDL script [here].
–> We will create:
1. One Disk based Table & one simple Stored Procedure which will use this Disk based Table.
2. One In-Memory Table & one Native Compiled Stored Procedure which will use this In-Memory Table.
1. Let’s first create a Disk based Table and a normal Stored Procedure:
USE [ManTest]
GO
-- Create a Disk table (non-Memory Optimized):
CREATE TABLE dbo.DiskTable
(
ID INT NOT NULL
PRIMARY KEY,
VarString VARCHAR(200) NOT NULL,
DateAdded DATETIME NOT NULL
)
GO
-- Create normal Stored Procedure to load data into above Table:
CREATE PROCEDURE dbo.spLoadDiskTable @maxRows INT, @VarString VARCHAR(200)
AS
BEGIN
SET NOCOUNT ON
DECLARE @i INT = 1
WHILE @i <= @maxRows
BEGIN
INSERT INTO dbo.DiskTable VALUES(@i, @VarString, GETDATE())
SET @i = @i+1
END
END
GO
2. Now create an In-Memory table & a Native Compiled Stored Procedure to load data:
-- Create an In-Memory table:
CREATE TABLE dbo.MemOptTable
(
ID INT NOT NULL
PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 10000),
VarString VARCHAR(200) NOT NULL,
DateAdded DATETIME NOT NULL
) WITH (
MEMORY_OPTIMIZED = ON,
DURABILITY = SCHEMA_AND_DATA
)
GO
-- Create Native Compiled Stored Procedure to load data into above Table:
CREATE PROCEDURE dbo.spLoadMemOptTable @maxRows INT, @VarString VARCHAR(200)
WITH
NATIVE_COMPILATION,
SCHEMABINDING,
EXECUTE AS OWNER
AS
BEGIN ATOMIC
WITH (TRANSACTION ISOLATION LEVEL=SNAPSHOT, LANGUAGE='us_english')
DECLARE @i INT = 1
WHILE @i <= @maxRows
BEGIN
INSERT INTO dbo.MemOptTable VALUES(@i, @VarString, GETDATE())
SET @i = @i+1
END
END
GO
–> Now we will try to load 10k records into the above 2 tables in various ways, as follows:
1. Load Disk based Table by T-SQL script using a WHILE loop.
2. Load the same Disk based Table by Stored Procedure which internally uses a WHILE loop.
3. Load In-Memory Table by T-SQL script using a WHILE loop.
4. Load the same In-Memory Table by Native Compiled Stored Procedure which internally uses a WHILE loop.
–> Working with Disk based Tables:
SET NOCOUNT ON
DECLARE
@StartTime DATETIME2,
@TotalTime INT
DECLARE
@i INT,
@maxRows INT,
@VarString VARCHAR(200)
SET @maxRows = 10000
SET @VarString = REPLICATE('a',200)
SET @StartTime = SYSDATETIME()
SET @i = 1
-- 1. Load Disk Table (without SP):
WHILE @i <= @maxRows
BEGIN
INSERT INTO dbo.DiskTable VALUES(@i, @VarString, GETDATE())
SET @i = @i+1
END
SET @TotalTime = DATEDIFF(ms,@StartTime,SYSDATETIME())
SELECT 'Disk Table Load: ' + CAST(@TotalTime AS VARCHAR) + ' ms (without SP)'
-- 2. Load Disk Table (with simple SP):
DELETE FROM dbo.DiskTable
SET @StartTime = SYSDATETIME()
EXEC spLoadDiskTable @maxRows, @VarString
SET @TotalTime = DATEDIFF(ms,@StartTime,SYSDATETIME())
SELECT 'Disk Table Load: ' + CAST(@TotalTime AS VARCHAR) + ' ms (with simple SP)'
–> Working with In-Memory Tables:
-- 3. Load Memory Optimized Table (without SP):
SET @StartTime = SYSDATETIME()
SET @i = 1
WHILE @i <= @maxRows
BEGIN
INSERT INTO dbo.MemOptTable VALUES(@i, @VarString, GETDATE())
SET @i = @i+1
END
SET @TotalTime = DATEDIFF(ms,@StartTime,SYSDATETIME())
SELECT 'Memory Table Load: ' + CAST(@TotalTime AS VARCHAR) + ' ms (without SP)'
-- 4. Load Memory Optimized Table (with Native Compiled SP):
DELETE FROM dbo.MemOptTable
SET @StartTime = SYSDATETIME()
EXEC spLoadMemOptTable @maxRows, @VarString
SET @TotalTime = DATEDIFF(ms,@StartTime,SYSDATETIME())
SELECT 'Memory Table Load: ' + CAST(@TotalTime AS VARCHAR) + ' ms (with Native Compiled SP)'
GO
–> Output (Loaded 10k records):
Disk based Table Load : 28382 ms (without SP)
Disk based Table SP Load : 8297 ms (with simple SP)
In-Memory Table Load : 5176 ms (without SP)
In-Memory Table SP Load : 174 ms (with Native Compiled SP)
So, you can clearly see the benefit and the multifold increase in performance from using In-Memory Tables & Native Compiled Stored Procedures. The graph below shows the performance as bar charts, impressive, isn't it?
–> Final Cleanup
DROP PROCEDURE dbo.spLoadDiskTable
DROP TABLE dbo.DiskTable
DROP PROCEDURE dbo.spLoadMemOptTable
DROP TABLE dbo.MemOptTable
GO
Preparing for 70-463 Exam : Implementing a Data Warehouse with Microsoft SQL Server 2012
After passing the 70-461 exam last year I'm now preparing for the 70-463 exam, i.e. Implementing a Data Warehouse with Microsoft SQL Server 2012. This exam mainly focuses on implementing Data Warehouses with Dimension & Fact tables, working with SSIS packages and Data Quality solutions. Being (primarily) a Database Developer, this exam will be a bit tough for me as I've not worked with Dimensional modelling, SSIS and DQS.
For preparation I'm referring to the Training Kit (Exam 70-463) Implementing a Data Warehouse with Microsoft SQL Server 2012 book.
This exam is primarily intended for ETL and Data Warehouse (DW) developers who create Business Intelligence (BI) solutions, and whose responsibilities include Data Cleansing, Extract Transform Load (ETL), and Data Warehouse implementation.
–> The Exam is divided into 5 modules:
1. Design and Implement a Data Warehouse | 11%
– Introducing Star and Snowflake Schemas
– Design and Implement Dimensions
– Design and Implement Fact tables
– Managing the Performance of a Data Warehouse
– PREP links: Introduction to dimensions (Analysis Services – multidimensional data) | Dimension relationships | Columnstore indexes
2. Extract and Transform data | 23%
– Define Connection managers
– Design Data Flow
– Implement Data Flow with Transformations
– Control Flow tasks and Containers, Precedence Constraints
– Manage SSIS package execution
– Implement Script tasks in SSIS
– PREP links: Integration Services (SSIS) connections | Data flow | Slowly changing dimension transformation
3. Load Data | 27%
– Design control flow
– Implement package logic by using SSIS variables and parameters
– Implement Control flow
– Implement Data Load options
– Implement Script components in SSIS
– Slowly Changing Dimensions
– Preparing a Package for Incremental Load
– Package Transactions, Checkpoints, Event Handlers
– PREP links: Integration Services transactions | Developing a custom task | Integration Services (SSIS) parameters
4. Configure and Deploy SSIS solutions | 24%
– Troubleshoot Data Integration issues
– Install and Maintain SSIS components
– Implement Auditing, Logging, and Event handling
– Deploy SSIS solutions
– Configure SSIS security settings
– Data Mining Task and Transformation
– Preparing Data for Data Mining
– Implementing SSIS Fuzzy Transformations
– PREP links: Troubleshooting tools for package development | Load-balancing packages on remote servers by using SQL Server Agent | Integration Services (SSIS) logging
5. Build Data Quality Solutions (DQS) | 15%
– Install and maintain Data Quality services
– Creating and Maintaining a Knowledge Base
– Create a Data Quality project to clean data, Profiling Data and Improving Data Quality
– Using DQS and the DQS Cleansing Transformation
– Implement Master Data Management (MDM) solutions
– Using Master Data Services Add-in for Excel
– PREP links: Install Data Quality Services | Install Master Data Services | Master Data Services features and tasks
–> You can visit the following study materials to prepare for this Exam:
Microsoft Link for this Certification: http://www.microsoft.com/learning/en-us/exam-70-463.aspx
Book on AMAZON.com: Training Kit (Exam 70-463) Implementing a Data Warehouse with Microsoft SQL Server 2012
I will try to update this blog post or will put up a new post with my learning while preparing for this exam.
All The Best!!!
SQL Server 2012 Certification Path
It's always good to take Certifications; they enhance your technical skills, prove your knowledge, and moreover they look good on your Resume!
Exactly a year ago I passed the "Querying Microsoft SQL Server 2012" (70-461) exam, and I blogged my experience [here]. Since then I have been getting a lot of hits on that post from all over the world. That's when I came to know that many people want to pursue this exam and the follow-up exams in sequence.
People ask me about study materials and dumps (which I strongly don't recommend). A few people are confused about which exam to take in what order. Some are not aware of the exams that lie in the Certification Path.
So, I have created a visual snapshot of all these exams for all SQL Server 2012 Certification exams at different levels:
So, you can start with any of the 3 exams at the bottom (first) level based upon your area of interest. A Dev can take 70-461, a DBA can start with 70-462, and a Data Warehouse Engineer can go with 70-463; you can take them in any order.
1. Microsoft Certified Professional: As soon as you pass any one exam you are an MCP.
2. Microsoft Certified Solutions Associate: After you are done with all three (70-461 + 70-462 + 70-463) you are an MCSA.
3. Microsoft Certified Solutions Expert: After achieving MCSA, you can either go for MCSE in Data Platform (70-464 + 70-465) or Business Intelligence (70-466 + 70-467).
4. Microsoft Certified Solutions Master: After achieving MCSE in Data Platform, you can go even further to MCSM by taking the 70-468 & 70-469 exams.
–> Study material for:
– Exam 70-461: Training Kit (Exam 70-461): Querying Microsoft SQL Server 2012 | My Blog Post on 70-461.
– Exam 70-462: Training Kit (Exam 70-462): Administering Microsoft SQL Server 2012 Databases | My Blog Post on 70-462
– Exam 70-463: Training Kit (Exam 70-463): Implementing a Data Warehouse with Microsoft SQL Server 2012 | My Blog Post on 70-463
– Exam 70-464: Channel9 Video | Instructor-led – Developing Microsoft SQL Server 2012 Databases
– Exam 70-465: Channel9 Video | Instructor-led – Designing Database Solutions for Microsoft SQL Server 2012
– Exam 70-466: Channel9 Video | Instructor-led – Implementing Data Models and Reports with Microsoft SQL Server 2012
– Exam 70-467: Channel9 Video | Instructor-led – Designing Business Intelligence Solutions with Microsoft SQL Server 2012
– Exam 70-986: Not yet available
– Exam 70-987: Not yet available
–> UPDATE:
– MCSE (DP: 464/465 and BI: 466/467) exams are updated with SQL 2014 topics.
– MCSA (461/462/463) exams will have SQL 2012 content only.
For more details about the Certification Path and exams you can check Microsoft Official site: http://www.microsoft.com/learning/en-us/sql-certification.aspx
–> Download SQL Server 2014 Full or Express version for practice:
CTP-2 released for SQL Server 2014 | and I’ve installed it !
The much awaited Community Technology Preview 2 (CTP-2) for SQL Server 2014 is released and you can Download it from [here].
Check out the Release Notes [here]. This lists some limitations, issues and workarounds for them.
–> As mentioned in my [previous post] for CTP-1:
– You cannot upgrade your existing installation of CTP-1 to CTP-2.
– and, similar to the CTP-1 restrictions, you cannot install CTP-2 alongside pre-existing versions of SQL Server, SSDT and Visual Studio.
So, this should also be a clean install to be used only for learning and POCs, and should not be used on Production Environments. Installation is very simple and similar to CTP-1 and previous SQL Server versions.
–> What’s new with CTP-2:
1. Can create Range Indexes for Ordered Scans (along with Hash Indexes in CTP-1).
2. Configure the In-memory usage limit to provide performance and stability for the traditional disk-based workloads.
3. Memory Optimization Advisor wizard added to SSMS for converting disk-based Tables to In-memory (Hekaton) Tables, by identifying Incompatible Data Types, Identity Columns, Constraints, Partitioning, Replications, etc.
4. Similarly, a "Native Compilation Advisor" wizard for converting Stored Procedures to Natively Compiled SPs, by identifying incompatible SQL statements, like: SET Options, UDFs, CTE, UNION, DISTINCT, One-part names, IN Clause, Subquery, TVFs, GOTO, ERROR_NUMBER, INSERT EXEC, OBJECT_ID, CASE, SELECT INTO, @@rowcount, QUOTENAME, EXECUTE, PRINT, EXISTS, MERGE, etc.
5. And many more enhancements with AlwaysOn, like: viewing XEvents in UTC time, triggering XEvents when replicas change synchronization state, recording the last committed time and transaction LSN when a replica goes into the resolving state, and a new wizard to greatly simplify adding a replica on Azure.
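Point 2 above (capping In-Memory usage) is done through Resource Governor. Here is a sketch, with illustrative pool and database names, of binding a memory-optimized database to a resource pool with a memory limit:

```sql
-- Create a resource pool that caps memory available to the bound database.
CREATE RESOURCE POOL MemOptPool WITH (MAX_MEMORY_PERCENT = 50)
GO
ALTER RESOURCE GOVERNOR RECONFIGURE
GO
-- Bind the memory-optimized database to the pool; the binding takes effect
-- the next time the database is brought online.
EXEC sp_xtp_bind_db_resource_pool 'ManTest', 'MemOptPool'
GO
```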
Enough for now, let me go back and work with CTP-2. Wait for more updates!!!