Advertisements
Watch & Subscribe my SQL videos on YouTube | Join me on Facebook

Get started with Python on SQL Server – Run Python with T-SQL on SSMS

August 10, 2018 Leave a comment

 
With SQL Server 2016 Microsoft added Machine Learning support with R Language in SQL engine itself and called it SQL Server R Services.

Going ahead with the new SQL Server 2017 version Microsoft added Python too as part of Machine Learning with existing R Language, and thus renamed it to SQL Server Machine Learning Services.
 

Installation/Setup

Here are few steps to get you started with Python programming in SQL Server, so that you can run Python scripts with T-SQL scripts within SSMS:
 

1. Feature Selection: While installing SQL Server 2017 make sure you’ve selected below highlighted services


 

2. Configure “external scripts enabled”: Post installation run below SQL statements to enable this option

sp_configure 'external scripts enabled'
GO
sp_configure 'external scripts enabled', 1;
GO
RECONFIGURE; 
GO
sp_configure 'external scripts enabled'
GO


 

3. Restart SQL Server service: by “services.msc” program from command prompt, and run below SQL statement, this should show run _value = 1

sp_configure 'external scripts enabled'
GO


 

If still you don’t see run _value = 1, then try restarting the Launchpad service mentioned below in Step #4.

4. Launchpad Service: Make sure this service is running, in “services.msc” program from command prompt. Restart the service, and it should be in Running state.

Its a service to launch Advanced Analytics Extensions Launchpad process that enables integration with Microsoft R Open using standard T-SQL statements. Disabling this service will make Advanced Analytics features of SQL Server unavailable.

Post restarting this service, it should return run _value = 1 on running the query mentioned at Step #3
 

Run Python from SSMS

So as you’ve installed SQL Server with ML services with Python & R, and enabled the components, now you can try running simple “Hello World!” program to test it:

EXEC sp_execute_external_script 
	@language = N'Python', 
	@script = N'print(''Hello Python !!! from T-SQL'')'


 

Let’s do simple math here:

EXEC sp_execute_external_script 
@language = N'Python', 
@script = N'
x = 5
y = 6
a = x * y
b = x + y
print(a)
print(b)'

 

If you still face issues you can go through addition configuration steps mentioned in [MSDN Docs link].

 


Advertisements

Azure Databricks learning resources (documentation and videos)

August 7, 2018 1 comment

 

Databricks Introduction

What is Azure Databricks [Video]

Create Databricks workspace with Apache Spark cluster

Extract, Transform & Load (ETL) with Databricks

– Documentation:
   – Azure
   – Databricks
 

From Channel 9

1. Data Science using Azure Databricks and Apache Spark [Video]

2. Data ingestion, stream processing and sentiment analysis using Twitter [Video]

3. ETL with Azure Databricks using ADF [Video]

4. ADF new features & integration with Azure Databricks [Video]

5. Azure Databricks introduces R Studio Integration [Video]

6. Run Jars and Python scripts on Azure Databricks using ADF [Video]
 

From Microsoft Build Conf


 


Azure Databricks (a fully managed Apache Spark offering)

July 28, 2018 Leave a comment

 

Databricks Introduction:

Azure Databricks = Best of Databricks + Best of Azure

Azure Databricks is an Apache Spark-based analytics platform optimized for the Microsoft Azure cloud services platform (PaaS).

It is a fast, easy-to-use, and collaborative Apache Spark–based analytics platform. Designed in collaboration with the creators of Apache Spark, it combines the best of Databricks and Azure to help you accelerate innovation with one-click set up, streamlined workflows, and an interactive workspace that enables collaboration among data scientists, data engineers, and business analysts. Because it’s an Azure service, you benefit from native integrations with other Azure services such as Power BI, SQL Data Warehouse, and Cosmos DB. You also get enterprise-grade Azure security, including Active Directory integration, compliance, and enterprise-grade SLAs.
 


 
–> With Databricks you can:
– Launch your new Spark environment with a single click.
– Integrate effortlessly with a wide variety of data stores.
– Use Databricks Notebooks to unify your processes and instantly deploy to production.
– Improve and scale your analytics with a high-performance processing engine optimized for the comprehensive, trusted Azure platform.
 

Learning Resources:

    Webinar recording on Azure Databricks

    My next blog in series
 


Azure SQL Data Sync – keep your data in sync between Azure Hybrid environment SQL Server and SQL DB

July 12, 2018 Leave a comment

 
SQL Data Sync is a service built on Azure SQL Database that lets you synchronize the data you select bi-directionally across multiple SQL databases and SQL Server instances.
 

Internal Mechanism & Performance impact:

Data Sync uses insert, update, and delete triggers to track changes. It creates side tables in the user database for change tracking. These change tracking activities have an impact on your database workload. Assess your service tier and upgrade if needed.

Since Data Sync is trigger-based, transactional consistency is not guaranteed. Microsoft guarantees that all changes are made eventually, and that Data Sync does not cause data loss.
 

Limitations & Requirements:

1. Each table must have a primary key
2. Snapshot isolation must be enabled
3. A table cannot have an identity column that is not the primary key
4. The names of objects (databases, tables, and columns) cannot contain the printable characters period (.), left square bracket ([), or right square bracket (]).
5. Unsupported data types
    a. FileStream
    b. SQL/CLR UDT
    c. XMLSchemaCollection (XML supported)
    d. Cursor, Timestamp, Hierarchyid
6. Azure Active Directory authentication is not supported.

Maximum number of sync groups any database can belong to 5
Maximum number of endpoints in a single sync group 30
Maximum number of on-premises endpoints in a single sync group 5
Database, table, schema, and column names 50 characters per name
Tables in a sync group 500
Columns in a table in a sync group 1000
Data row size on a table 24 Mb
Minimum sync interval 5 Minutes

 

Configurations:

1. Frequency can be set in seconds, minutes, hours and days (min 5 minutes)
2. You can choose and select desired tables and columns to sync
3. For on premises database you must configure a local DMG agent
 

Setup Azure SQL Data Sync:

Check here for Step-by-Step tutorial.

1. Create a Hub database on SQL DB (Hub & Spoke topology)
    a. Hub database: must be Azure SQL DB
    b. Spoke/Member database: rest of databases are either Azure SQL DB, on SQL Server instance

2. Create Sync group (On the Hub Database create a “New Sync Group”)
    a. Sync Schema: which data is being synchronized
    a. Sync metadata database: must be an Azure SQL DB
    b. Sync Interval: frequency
    c. Conflict Resolution Policy: (Hub wins or Member wins)

3. Add Sync Members (Spokes, can be either SQL DB or SQL instance)
    a. Sync Agent Gateway (for on-prem): download and install on on-premise server
    b. Sync Direction: bi-directional, or one direction

4. Configure Sync group
    a. Select the Tables/Columns which you want to sync.
 

Powershell script to:

Sync SQL DB & on-prem SQL Server instance
Sync between multiple SQL Databases
 

Best Practices: https://docs.microsoft.com/en-us/azure/sql-database/sql-database-best-practices-data-sync

Monitor: https://docs.microsoft.com/en-us/azure/sql-database/sql-database-sync-monitor-oms

Troubleshoot: https://docs.microsoft.com/en-us/azure/sql-database/sql-database-troubleshoot-data-sync
 

Recommendations:

Data Sync is not appropriate for the following scenarios:
1. Disaster Recovery
2. Read Scale
3. ETL (OLTP to OLAP)
4. Migration from on-premises SQL Server to Azure SQL Database
 

Further reading: https://blogs.msdn.microsoft.com/igorpag/2017/07/06/azure-sql-data-sync-test-drive-and-first-impressions/


All about Blockchain – Introduction

June 25, 2018 2 comments

 
Blockchain is a disruptive technology that is going to transform many industries in near future. Bitcoin is one of its implementation. Let’s check what is Blockchain is all about.
 

What is Blockchain?

A Blockchain is an immutable, secure, digital, distributed ledger without a central authority, which uses public/private signature technology to validate and record transactions in near real-time. It is nothing but a data structure which works as decentralized database or a distributed ledger that stores a registry of assets and transactions over a peer-to-peer network of computers.

Blockchain’s distributed digital ledger contains cryptographically signed transactions that are grouped into blocks. Each Transaction recorded in the database is digitally signed and mathematically guaranteed to be authentic and resistant to fraud. Each block is cryptographically linked to the previous one after validation and undergoing a Proof of Work or Consensus decision. As new blocks are added, older blocks become more difficult to modify, thus immutable. New blocks are replicated across all copies of the ledger within the network/nodes, and any conflicts are resolved automatically using established rules.

 

Blockchain Internals:

1. Blockchain uses a Distributed Ledger to track transactions

2. A Ledger is a Immutable (append only) database most commonly used in accounting

3. Same copy of the data distributed across all the participating Nodes (Decentralized)

4. All new transactions are securely encrypted and then broadcast across the Blockchain network to be added to the system

5. Participants in the Blockchain verify the transaction is valid and then writes it to the Ledger

6. Transactions are grouped together in Blocks. Blocks are linked to previous Blocks, which make the blockchain.

7. The Transaction chain tracks how ownership changes, while the Block-Chain tracks the order of transactions.
 

Example: how Blockchain works?

1. A person X transfers $100 to person Y. Both persons X & Y have their Account numbers and corresponding Private Keys.

2. A Transaction record is created which contains the transaction details and digital signatures from both persons/parties.

2. The Transaction is broadcasted to the network for verification by the various computer nodes to make sure if the transaction details are valid.

3. On successful validation of Transaction is accepted in the network and added to a Block. Each block contains a unique code called a Hash, it also contains the hash of the previous block in the chain.

4. The verified Block is added to the Blockchain. The Hash codes connect the blocks together in a specific order.

… However this does not look as simple as mentioned above. There goes lot of computations and verifications to make sure what transactions are valid and to add them to the Blockchain.
 

Blockchain vs Traditional Ledgers:

– Traditional Ledgers are centralized and thus requires third Party authority and middlemen to authenticate, approve and record Transactions.

– But Blockchain safely distributes ledger across the entire network and does not require any middlemen.

Blockchain implementations:

– Eliminates Intermediaries: Allows industries to redefine or create new business models.

– Reduces Fraud: Highly secure and transparent, making it nearly impossible to change historical records.

– Increases Efficiency and Speed: Simplifies transactions and enables T+Zero settlement time.

– Increases Revenue and Savings: Potential savings and new revenue opportunities through more efficient processes and reduced costs.
 

Blockchain Use Cases:

1. Ownerships
    – Land registries
    – Property titles
    – Other physical assets

2. Identities
    – Blockchain e-identities to citizens
    – Use services like voting
    – Healthcare records

3. Verification
    – Licenses
    – Proofs of records (degrees, grades, etc)
    – Transactions
    – Processes or events

4. Movement of assets
    – Transferring money from one person/entity to another.
    – Enabling direct payments, once a work condition has been performed.
 

Blockchain implementations:

1. Bitcoin or BTC serves as the cryptographically secured unit of value, numeraire (standard for currency exchange) and currency in the case of the Bitcoin protocol and hybrid fuel/currency used as a Cryptocurrency.

2. Ethereum or ETH serves as the cryptographically secured unit of value, numeraire and hybrid fuel/currency for the Ethereum protocol.

3. Ripple, LiteCoin, etc.
 

Blockchain Network Types:

1. Public:
– Many unknown participants, like Banks, Traders, Financial firms etc.
– Writes by all participants
– Reads by all participants
– Consensus by Proof of Work

2. Private:
– Known participation from one organization, like a Bank
– Write permission centralized
– Reads may be public or restricted
– multiple algorithms for consensus

3. Consortium:
– Known participation from multiple organization
– Write requires consensus from several participants
– Reads may be public or restricted
– multiple algorithms for consensus
 

Microsoft Blockchain offering:

Microsoft has a Blockchain offering on Microsoft Azure as Ethereum Blockchain as a Service (EBaaS) so Enterprise clients and developers can have a single click, low cost, ready-made, cloud based blockchain dev/test/production environment.
 

We will see more on Blockchain in next posts !!!