Microsoft Azure Data Platform – September (2017) update

October 1, 2017

Microsoft Azure updates curated from September 2017, covering Azure SQL DB/DW, HDInsight, Machine Learning (ML), Azure Data Lake (ADL), etc.

1. Run Hortonworks clusters and easily access Azure Data Lake

2. Microsoft at PostgresOpen 2017

3. Bot conversation history with Azure Cosmos DB

4. Azure Stream Analytics drives retail industry transformation with real-time insights

5. Try Azure CosmosDB for free

6. Using Azure Analysis Services with Azure Data Lake Store

7. Ask us anything about the new Azure Log Analytics language

8. Azure HDInsight training resources – Learn about big data using open source technologies

9. September updates to the Azure Analysis Services web designer

10. Azure Analysis Services now available in Azure Government

11. Diving deep into what’s new with Azure Machine Learning

12. Azure CosmosDB – database for serverless era

13. Announcing tools for the AI-driven digital transformation

14. Introducing SQL Vulnerability Assessment for Azure SQL Database and on-premises SQL Server!

15. Azure Data Factory – announcing new capabilities in public preview

16. Get insights into your Azure CosmosDB: partition heatmaps, OMS, and more

17. Run your Hive LLAP & PySpark Job in Visual Studio Code

18. Azure SQL Database and Data Warehouse VNET Service Endpoints public preview

19. General availability of HDInsight Interactive Query – blazing fast queries on hyper-scale data

20. Azure Log Analytics – meet our new query language

21. Paxata launches Self-Service Data Preparation on Azure HDInsight to accelerate Data Prep

22. Introducing HDInsight integration with Azure Log Analytics Preview

23. Azure Analysis Services adds firewall support

24. Monitoring Azure SQL Data Sync using OMS Log Analytics


Microsoft Azure Data Platform – August (2017) update

September 1, 2017

Microsoft Azure updates curated from August 2017, covering Azure SQL DB/DW, HDInsight, Machine Learning (ML), Azure Data Lake (ADL), etc.

1. Teradata Bolsters Analytics and Database capabilities for Azure

2. Migrating a Web App from ClearDB to Azure Database for MySQL

3. Online training for Azure Data Lake (ADL)

4. Introducing the #Azure CosmosDB Change Feed Processor Library

5. August updates to the Azure Analysis Services web designer

6. Bring Interactive Analytics to Azure HDInsight: Kyligence Analytics Platform enables sub-second query

7. Azure AD (AAD) authentication extensions for Azure SQL DB and SQL DW tools

8. Data Management Gateway (DMG) – High Availability and Scalability Preview

9. Using Azure Analysis Services over Azure SQL DB and DW

10. Azure Data Factory (ADF) July new features update

11. Replicated tables now in preview for Azure SQL Data Warehouse

12. On-premises data gateway support for Azure Analysis Services

13. Imanis Data – Cloud migration, backup for your big data apps on Azure HDInsight

14. Azure Database for MySQL and Azure Database for PostgreSQL availability in India

15. Perform advanced analytics on Application Insights data using Jupyter Notebook

16. Announcing the public preview of Azure Archive Blob Storage and Blob-Level Tiering

17. Azure Analysis Services web designer adds visual model editing to the preview

18. Debug Spark Code Running in Azure HDInsight from Your Desktop

19. Hortonworks extends IaaS offering on Azure with Cloudbreak

20. Announcing Azure Data Lake Store Capture Provider for Event Hubs Capture

21. Default compatibility level 140 for Azure SQL databases

22. Preview: SQL Transparent Data Encryption (TDE) with Bring Your Own Key support

23. Announcing Azure Blob storage events preview

24. Stream Processing Changes: Azure CosmosDB change feed + Apache Spark

25. Machine Learning based anomaly detection in Azure Stream Analytics

26. Announcing Default Encryption for Azure Blobs, Files, Table and Queue Storage

27. Automation of Azure Analysis Services with Service Principals and PowerShell

Sample 14 Interview Questions and Answers for Hadoop Administration Certified Professional

August 21, 2017

Despite plenty of opportunities for Hadoop professionals, landing a good job can be difficult, because cracking the Hadoop admin interview is a challenge that takes preparation. At Koenig Solutions, candidates not only acquire Hadoop administration certification but also prepare for the interview, so they can start a challenging yet lucrative career.

–> This article lists 14 important questions and answers commonly asked in Hadoop administration job interviews:

Q1. What daemons are required to run a Hadoop cluster?
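A. In a classic (Hadoop 1.x) cluster, the NameNode, DataNode, JobTracker and TaskTracker daemons are required. In YARN-based Hadoop 2.x and later, the ResourceManager and NodeManager replace the JobTracker and TaskTracker.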

Q2. How would you restart a NameNode?
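A. The easiest way is to run the shell script that stops the running daemons and then run the script that starts them again, which restarts the NameNode. In Hadoop 1.x, for example, that is stop-all.sh followed by start-all.sh; to restart only the NameNode, hadoop-daemon.sh stop namenode followed by hadoop-daemon.sh start namenode also works.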

Q3. What are different schedulers available in Hadoop?
A. a. COSHH: Considers workload, cluster and user heterogeneity when making scheduling decisions.
    b. FIFO Scheduler: Doesn’t consider heterogeneity; it orders jobs by their arrival time in the queue.
    c. Fair Sharing: Defines a pool for each user. Users run their jobs in their own pools.

Q4. What Hadoop shell commands can be used to perform copy operation?
A. fs -copyToLocal
    fs -put
    fs -copyFromLocal

Q5. What’s the purpose of the jps command?
A. It is used to confirm whether the daemons that make up the Hadoop cluster are running. The output of the jps command lists the Java processes running on the node, such as DataNode, NameNode, Secondary NameNode, JobTracker and TaskTracker.
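
The check described in this answer can be automated. The following is a minimal Python sketch, assuming jps prints one "<pid> <class name>" pair per line and that the node is expected to run all five Hadoop 1.x daemons; the function name missing_daemons and the sample output are hypothetical, made up for illustration.

```python
# Daemon class names for a classic Hadoop 1.x node (assumption: single-node setup
# where every daemon runs on the same machine).
EXPECTED_DAEMONS = {
    "NameNode", "SecondaryNameNode", "DataNode", "JobTracker", "TaskTracker",
}


def missing_daemons(jps_output, expected=EXPECTED_DAEMONS):
    """Return the expected daemons that do not appear in the given jps output."""
    running = set()
    for line in jps_output.splitlines():
        parts = line.split()
        # Each jps line is "<pid> <main class name>"; skip anything else.
        if len(parts) >= 2 and parts[0].isdigit():
            running.add(parts[1])
    return sorted(set(expected) - running)


# Hypothetical jps output from a node where the TaskTracker is not running.
sample = """\
4821 NameNode
4933 DataNode
5057 SecondaryNameNode
5140 JobTracker
5302 Jps
"""

print(missing_daemons(sample))  # ['TaskTracker']
```

On a multi-node cluster jps only sees the JVMs on the local machine, so the expected set would need to be adjusted per node (for example, only DataNode and TaskTracker on a worker).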

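Q6. How many NameNodes can be run on a single Hadoop cluster?
A. Only one in a classic (Hadoop 1.x) cluster. Hadoop 2.x and later can add a standby NameNode for high availability and support multiple NameNodes through federation.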

Q7. What will happen when the NameNode on the Hadoop cluster is down?
A. Whenever the NameNode is down, the file system goes offline.

Q8. Detail crucial hardware considerations when deploying Hadoop in a production environment.
A. Operating System: A 64-bit operating system.
    Capacity: Larger form factor (3.5”) disks allow more storage and cost less.
    Network: Two TOR (top-of-rack) switches per rack for better redundancy.
    Storage: To achieve high performance and scalability, it is better to design the Hadoop platform so that compute activity moves to the data.
    Memory: The system’s memory requirements vary based on the application.
    Computational Capacity: Can be determined by the total count of MapReduce slots available across the nodes of the Hadoop cluster.

Q9. Which command will you use to determine if the HDFS (Hadoop Distributed File System) is corrupt?
A. The hadoop fsck (file system check) command, for example: hadoop fsck /
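
A saved fsck report can also be checked from a script. This is a rough Python sketch, assuming the report contains a summary line of the form "The filesystem under path '/' is HEALTHY" (or CORRUPT); the function name fsck_status and the sample report text are hypothetical, and the exact report layout may differ between Hadoop versions.

```python
import re

# Assumed summary line: The filesystem under path '<path>' is HEALTHY|CORRUPT
_STATUS_RE = re.compile(
    r"The filesystem under path '(?P<path>[^']*)' is (?P<status>HEALTHY|CORRUPT)"
)


def fsck_status(report):
    """Return (path, status) from an fsck report, or None if no summary line is found."""
    match = _STATUS_RE.search(report)
    if match is None:
        return None
    return match.group("path"), match.group("status")


# Hypothetical, shortened report text used only for illustration.
sample_report = """\
Status: HEALTHY
 Total blocks (validated): 120
 Corrupt blocks: 0

The filesystem under path '/' is HEALTHY
"""

print(fsck_status(sample_report))  # ('/', 'HEALTHY')
```

Returning None when the summary line is missing keeps an unreadable or truncated report from being mistaken for a healthy file system.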

Q10. How can a Hadoop job be killed?
A. Use the command: hadoop job -kill jobID

Q11. Can files be copied across multiple clusters? If yes, how?
A. Yes, it is possible using distributed copy. The distcp command can be used for intra-cluster or inter-cluster copying.

Q12. Recommend the best Operating System to run Hadoop.
A. Linux (for example, Ubuntu) is the best choice. Although Windows can be used, it can lead to several problems.

Q13. How often should the NameNode be reformatted?
A. Never, as it can lead to complete data loss. It is formatted only once, in the beginning.

Q14. What are Hadoop configuration files and where are they located?
A. Hadoop has three main site-specific configuration files (mapred-site.xml, hdfs-site.xml, and core-site.xml), which are located in the “conf” subdirectory of the Hadoop installation.
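
All three files share the same layout: a configuration root element containing property entries, each with a name and a value. As an illustration, here is a small Python sketch that reads such a file into a dictionary. The helper name read_hadoop_config, the host name and the directory path in the sample are hypothetical; fs.default.name and hadoop.tmp.dir are the Hadoop 1.x property names.

```python
import xml.etree.ElementTree as ET


def read_hadoop_config(xml_text):
    """Parse the text of a Hadoop *-site.xml file into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


# Hypothetical core-site.xml content; host and path are made up.
sample = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/var/hadoop/tmp</value>
  </property>
</configuration>
"""

config = read_hadoop_config(sample)
print(config["fs.default.name"])  # hdfs://namenode.example.com:8020
```

To read a real file, pass the contents of conf/core-site.xml (or one of the other two files) to the same function.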

Check out: Best Free Resources For Sharpening Your Skills In Hadoop. These are just a few questions, but you may come across several others, depending on your Hadoop

Author Bio: Michael Warne is a tech blogger and an expert in Hadoop certification training. He has 5 years of experience in the Hadoop industry and has worked as a certified Hadoop professional for top-notch IT companies.

Microsoft Azure Data Platform – July (2017) update

August 1, 2017

SQL Server 2017 Release Candidate (RC1, full & final version) is available for download

On 17th July 2017 Microsoft released the full & final Release Candidate 1 (RC1) version of SQL Server 2017.

As announced earlier with the first CTP release, the new SQL Server 2017 will run on both Windows and Linux. Beyond Linux, it will also be supported on Docker, and on macOS via Docker.

–> Download SQL Server 2017 bits:

To download SQL Server 2017, you can register and download the full version or the free evaluation version (180 days).

Or, directly download the ISO (~1.7 GB): SQLServer2017RC1-x64-ENU.iso

–> Check version and SQL build:

select @@version

Microsoft SQL Server 2017 (RC1) - 14.0.800.90 (X64)
Jul 11 2017 07:03:16
Copyright (C) 2017 Microsoft Corporation. All rights reserved.
Enterprise Evaluation Edition (64-bit) on Windows 10 Enterprise 10.0 (Build 14393: ) (Hypervisor)
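
If the version banner needs to be checked from a script, the four-part build number can be pulled out with a regular expression. This is a minimal Python sketch that works on the banner text only and does not connect to a server; the function name parse_sql_build is hypothetical.

```python
import re


def parse_sql_build(version_text):
    """Return (major, minor, build, revision) from a @@version banner string."""
    # The first four-part dotted number in the banner is the SQL Server build.
    match = re.search(r"\b(\d+)\.(\d+)\.(\d+)\.(\d+)\b", version_text)
    if match is None:
        raise ValueError("no four-part build number found")
    return tuple(int(part) for part in match.groups())


banner = "Microsoft SQL Server 2017 (RC1) - 14.0.800.90 (X64)"
major, minor, build, revision = parse_sql_build(banner)

# Major version 14 corresponds to SQL Server 2017.
print(major, build)  # 14 800
```

Because the pattern requires four dot-separated numbers, two-part values elsewhere in the banner, such as the Windows version 10.0, are not picked up.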


–> New Features & Enhancements: This Release Candidate version is the final version of SQL Server 2017 and adds the following features:

1. SQL Server on Linux supports Active Directory Authentication, which enables domain-joined clients on either Windows or Linux to authenticate to SQL Server using their domain credentials and the Kerberos protocol.

2. SQL Server on Linux can use TLS to encrypt data that is transmitted across a network between a client application and an instance of SQL Server.

3. Added more model management capabilities for R Services on Windows Server, including External Library Management. The new release also supports Native Scoring.

4. Additional Dynamic Management Views (DMVs) for Analysis Services, enabling dependency analysis and reporting (DISCOVER_CALC_DEPENDENCY, MDSCHEMA_MEASUREGROUP_DIMENSIONS).

5. Added support for SSIS scale out in HA environments; customers can now enable Always On for SSIS.

6. Features already rolled out in previous CTP versions:

– All new features added in SQL Server 2016 SP1, [link].

– New features added in SQL Server 2017 CTP 1.x, [link].

– New features added in SQL Server 2017 CTP 2.x, [link].

–> Videos on SQL Server 2017

Download & Install SQL Server 2017 & SSMS on Windows

Install SQL Server on Linux

–> References:

>> SQL Server 2017 official page

>> Docs for SQL Server 2017