Indexes | SQL with Manoj

SQL Server 2017 – ColumnStore Index enhancements and improvements over previous versions

December 21, 2017 1 comment

ColumnStore Indexes were first introduced in SQL Server 2012, and this created a new way to store and retrieve the Index or Table data in an efficient manner.

What is a ColumnStore Index?

What all new features & enhancements done in ColumnStore Index from SQL Server 2012 to 2014 and 2016?

–> What’s new in SQL Server 2017?

1. Online Non-Clustered ColumnStore index build and rebuild support added

2. Clustered Columnstore Indexes now support LOB columns (nvarchar(max), varchar(max), varbinary(max))

3. Columnstore index can have a non-persisted computed columns

4. The -fc option in Database Tuning Advisor (DTA) for allowing recommendations of ColumnStore indexes

–> Video on ColumnStore Index:

Categories: Indexes, SQL Server 2017 Tags: ColumnStore Index, SQL Server 2017

SQL Error – Cannot create index on view ‘vw_SomeView’ because the view is not schema bound.

October 5, 2015 Leave a comment

This error occurs when you try to create an Index on top of a View which is not created by using “WITH SCHEMABINDING” option.

When you create Index on top of a View they are called “Indexed View” or “Materialized View”.

–> Let’s check below:

use [AdventureWorks2014]
go

DROP VIEW IF EXISTS vw_Person
GO

CREATE VIEW vw_Person
AS
SELECT 
	[BusinessEntityID], 
	[FirstName], [MiddleName], [LastName], 
	[Suffix], 
	[EmailPromotion], 
	[ModifiedDate]
FROM [Person].[Person]
GO

CREATE UNIQUE CLUSTERED INDEX CIX_vw_Person ON [vw_Person]([BusinessEntityID])
GO

Msg 1939, Level 16, State 1, Line 18
Cannot create index on view ‘vw_Person’ because the view is not schema bound.

–> So to fix this issue you can ALTER the view definition by adding “WITH SCHEMABINDING” option with ALTER/CREATE VIEW statement as shown below.

Thus, creating an Index on a Schema bound view will not throw this error.

ALTER VIEW vw_Person
WITH SCHEMABINDING --<< here <<
AS
SELECT 
	[BusinessEntityID], 
	[FirstName], [MiddleName], [LastName], 
	[Suffix], 
	[EmailPromotion], 
	[ModifiedDate]
FROM [Person].[Person]
GO

CREATE UNIQUE CLUSTERED INDEX CIX_vw_Person ON [vw_Person]([BusinessEntityID])
GO

–> By adding this rule or restriction SQL Server just wants to make sure if an Index is created on a View, nobody can directly change the definition of Table(s) underneath the View. And if it needs a change then first the View need to be dropped and re-created after altering the tables (error shown below).

ALTER [Person].[Person]
DROP COLUMN [Suffix]

Msg 5074, Level 16, State 1, Line 32
The object ‘vw_Person’ is dependent on column ‘suffix’.
Msg 4922, Level 16, State 9, Line 32
ALTER TABLE DROP COLUMN suffix failed because one or more objects access this column.

Categories: Indexes, SQL Errors Tags: Indexed View, Materialized View, SQL Views

ColumnStore Indexes evolution from SQL Server 2012, 2014 to 2016

June 8, 2015 4 comments

ColumnStore Indexes were first introduced in SQL Server 2012, and this created a new way to store and retrieve the Index or Table data in an efficient manner.

ColumnStore uses xVelocity technology that was based on Vertipaq engine, this uses a new Columnar storage technique to store data that is highly Compressed and is capable of In-memory Caching and highly parallel data scanning with Aggregation algorithms.

ColumnStore or Columnar data format does not store data in traditional RowStore fashion, instead the data is grouped and stored as one column at a time in Column Segments.

The Traditional RowStore stores data for each row and then joins all the rows and store them in Data Pages, and is still the same storage mechanism for Heap and Clustered Indexes.

Thus, a ColumnStore increases the performance by reading less data that is highly compressed, and in batches from disk to memory by further reducing the I/O.

–> To know more about ColumnStore Indexes check [MSDN BoL].

–> Let’s see the journey of ColumnStore index from SQL Server 2012 to 2016:

✔ We will start with SQL Server 2012 offering for ColumnStore Indexes:

1. A Table (Heap or BTree) can have only one NonClustered ColumnStore Index.

2. A Table with NonClustered ColumnStore Index becomes readonly and cannot be updated.

3. The NonClustered ColumnStore Index uses Segment Compression for high compression of Columnar data.

✔ With SQL Sever 2014 some new features were added, like:

1. You can create one Clustered ColumnStore Index on a Table, and no further Indexes can be created.

2. A Table with Clustered ColumnStore Index can be updated with INSERT/UPDATE/DELETE operations.

3. Both Clustered & NonClustered ColumnStore Index has new Archival Compression options i.e. COLUMNSTORE_ARCHIVE to further compress the data.

✔ Now with SQL Server 2016 new features have been added and some limitations are removed:

1. A Table will still have one NonClustered ColumnStore Index, but this will be updatable and thus the Table also.

2. Now you can create a Filtered NonClustered ColumnStore Index by specifying a WHERE clause to it.

3. A Clustered ColumnStore Index table can have one or more NonClustered RowStore Indexes.

4. Clustered ColumnStore Index can now be Unique indirectly as you can create a Primary Key Constraint on a Heap table (and also Foreign Key constraints).

5. Columnar Support for In-Memory (Hekaton) tables, as you can create one one ColumnStore Index on top of them.

–> Here is the full feature summary of ColumnStore Indexe evolution from SQL Server 2012 to 2016:

–> Check more about this on [MSDN BoL].

Check the above details explained in the video below:

Categories: Indexes, SQL Server 2016 Tags: Clustered Columnstore Index, ColumnStore, ColumnStore Index, NonClustered columnstore index, SQL Server 2016

Index Usage Stats – Indexes Used, Unused and Updated

April 2, 2015 1 comment

While working on Performance Optimization and Index tuning many Developers ends up in creating some (or even many) unnecessary Indexes based upon various SQL Queries. Some or many of those Indexes might not be used at all by the SQL Query Optimizer. But even if they are used you may have to Trade-off with your ETL performance going down, as the CRUD (CREATE, READ, UPDATE, DELETE) operations are going to taking more time to update those new Indexes.

So, while creating new Indexes you will have to plan very carefully, decide and balance out things so that your Data retrieval is fast and on the same hand your ETLs are also not affected much.

SQL Server provides some DMVs (Dynamic Management Views) and DMFs (Dynamic Management Functions) to get this information from SQL engine.

–> To know how the Indexes are getting used we can use sys.dm_db_index_usage_stats DMV and this will provide us information on how many times the Index was used for SEEK, SCAN & LOOKUP operations. Check the Query and its output below:

SELECT 
	DB_NAME(database_id) AS DATABASE_NAME,
	OBJECT_NAME(ius.object_id) AS TABLE_NAME, 
	ids.name AS INDEX_NAME, 
	ius.user_seeks AS SEEK_COUNT, 
	ius.user_scans AS SCAN_COUNT, 
	ius.user_lookups AS LOOKUP_COUNT, 
	ius.user_seeks + ius.user_scans + ius.user_lookups AS TOTAL_USAGE, 
	ius.last_user_seek AS LAST_SEEK_COUNT, 
	ius.last_user_scan AS LAST_SCAN_COUNT, 
	ius.last_user_lookup AS LAST_LOOKUP_COUNT
FROM sys.dm_db_index_usage_stats AS ius
INNER JOIN sys.indexes AS ids
ON ids.object_id = ius.object_id
AND ids.index_id = ius.index_id 
WHERE OBJECTPROPERTY(ius.object_id,'IsUserTable') = 1 
ORDER BY DATABASE_NAME, TOTAL_USAGE

–> Now if we want to know the maintenance overhead on the new Indexes we created, like every time a related Table is updated the Indexes are also updated. We can check by using sys.dm_db_index_operational_stats DMF and it will show how many INSERT, UPDATE & DELETE operations are happening on particular indexes. Check the Query and its output below:

SELECT 
	DB_NAME(database_id) AS DATABASE_NAME,
	OBJECT_NAME(ios.object_id) AS TABLE_NAME, 
	idx.name AS INDEX_NAME, 
	ios.leaf_insert_count AS INSERT_COUNT, 
	ios.leaf_update_count AS UPDATE_COUNT, 
	ios.leaf_delete_count AS DELETE_COUNT, 
	ios.leaf_insert_count + ios.leaf_update_count + ios.leaf_delete_count AS TOTAL_COUNT 
FROM sys.dm_db_index_operational_stats (NULL,NULL,NULL,NULL ) ios
INNER JOIN sys.indexes AS idx
ON idx.object_id = ios.object_id
AND idx.index_id = ios.index_id 
WHERE  OBJECTPROPERTY(ios.object_id,'IsUserTable') = 1
ORDER BY DATABASE_NAME, TOTAL_COUNT

–> So, by using these two Dynamic Management Views/Functions you can know:

1. What is the affect (maintenance overhead) of Indexes you created, and

2. Are the Indexes really used or not so that you can DROP them.

Please Note: that these stats could be wrong at times because of several reasons, like:

1. If SQL Server (MSSQLSERVER) service is re-started these counters are initialized to 0.

2. When a Database is detached or is shut down (for example, because AUTO_CLOSE is set to ON), all rows associated with the database are removed.

3. Index Rebuild resets these counters to 0 (Bug logged in MS Connect, link).

Categories: Indexes, Optimization Performance Tags: Index Scan, Index Seek, Index Usage Stats, Unused Indexes, Used Indexes

Memory Optimized Indexes | Hash vs Range Indexes – SQL Server 2014

December 20, 2013 3 comments

In SQL Server 2014 for In-Memory tables there are lot of changes in DDLs compared with normal Disk Based Tables. In-Memory Tables related changes we’ve seen in previous posts, check [here]. Here we will see Memory Optimized Index related changes and few important things to take care before designing your Tables and Indexes.

–> Some of the main points to note are:

1. Indexes on In-Memory tables must be created inline with CREATE TABLE DDL script only.

2. These Indexes are not persisted on Disk and reside only in memory, thus they are not logged. As these Indexes are not persistent so they are re-created whenever SQL Server is restarted. Thus In-Memory tables DO NOT support Clustered Indexes.

3. Only two types of Indexes can be created on In-Memory tables, i.e. Non Clustered Hash Index and Non Clustered Index (aka Range Index). So there is no bookmark lookup.

4. These Non Clustered Indexes are inherently Covering, and all columns are automatically INCLUDED in the Index.

5. Total there can be MAX 8 Non Clustered Indexes created on an In-Memory table.

–> Here we will see how Query Optimizer uses Hash & Range Indexes to process query and return results:

1. Hash Indexes: are used for Point Lookups or Seeks. Are optimized for index seeks on equality predicates and also support full index scans. Thus these will only perform better when the predicate clause contains only equality predicate (=).

2. Range Indexes: are used for Range Scans and Ordered Scans. Are optimized for index scans on inequality predicates, such as greater than or less than, as well as sort order. Thus these will only preform better when the predicate clause contains only inequality predicates (>, <, =, BETWEEN).

–> Let’s check this by some hands-on code. We will create 2 similar In-Memory tables, one with Range Index and another with Hash Index:

-- Create In-Memory Table with simple NonClustered Index (a.k.a Range Index):
CREATE TABLE dbo.MemOptTable_With_NC_Range_Index
(
    ID INT NOT NULL
        PRIMARY KEY NONCLUSTERED,
    VarString VARCHAR(200) NOT NULL, 
    DateAdded DATETIME NOT NULL
) WITH (
	MEMORY_OPTIMIZED = ON, 
	DURABILITY = SCHEMA_AND_DATA
)
GO

-- Create In-Memory Table with NonClustered Hash Index:
CREATE TABLE dbo.MemOptTable_With_NC_Hash_Index
(
    ID INT NOT NULL
        PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 10000),
    VarString VARCHAR(200) NOT NULL, 
    DateAdded DATETIME NOT NULL
) WITH (
	MEMORY_OPTIMIZED = ON, 
	DURABILITY = SCHEMA_AND_DATA
)
GO

–> Now we will Insert about 10k records on both the tables, so that we have good numbers of records to compare:

DECLARE @i INT = 1

WHILE @i <= 10000
BEGIN
	INSERT INTO dbo.MemOptTable_With_NC_Range_Index 
		VALUES(@i, REPLICATE('a', 200), GETDATE())

	INSERT INTO dbo.MemOptTable_With_NC_Hash_Index 
		VALUES(@i, REPLICATE('a', 200), GETDATE())

	SET @i = @i+1
END

–> Now check the Execution Plan by using equality Operator (=) on both the tables:

SELECT * FROM MemOptTable_With_NC_Hash_Index  WHERE ID = 5000 -- 4%
SELECT * FROM MemOptTable_With_NC_Range_Index WHERE ID = 5000 -- 96%

You will see in the Execution Plan image below that Equality Operator with Hash Index Costs you only 4%, but Range Index Costs you 96%.

–> Now check the Execution Plan by using inequality Operator (BETWEEN) on both the tables:

SELECT * FROM MemOptTable_With_NC_Hash_Index  WHERE ID BETWEEN 5000 AND 6000 -- 99%
SELECT * FROM MemOptTable_With_NC_Range_Index WHERE ID BETWEEN 5000 AND 6000 -- 1%

You will see in the Execution Plan image below that Inequality Operator with Range Index Costs you only 1%, but Hash Index Costs you 99%.

So, while designing In-Memory Tables and Memory Optimized Indexes you will need to see in future that how you will be going to query that table. It also depends upon various scenarios and conditions, so always keep note of these things in advance while designing your In-Memory Tables.

Update: Know more about In-Memory tables: