
14 Sample Interview Questions and Answers for Hadoop Administration Certified Professionals

August 21, 2017

 
Despite the abundance of opportunities for Hadoop professionals, landing a good job can still be difficult, because cracking the Hadoop admin interview is a challenge you must prepare for. At Koenig Solutions, candidates not only earn Hadoop administration certification but also prepare for the interview, so they can start a challenging yet lucrative career.
 

–> This article lists 14 important questions and answers commonly asked in Hadoop administration job interviews:
 

Q1. What daemons are required to run a Hadoop cluster?
A. DataNode, NameNode, JobTracker and TaskTracker are required for the process.
 

Q2. How would you restart a NameNode?
A. The easiest way is to run stop-all.sh to stop all the Hadoop daemons and then run start-all.sh to bring them back up. To restart only the NameNode, you can run hadoop-daemon.sh stop namenode followed by hadoop-daemon.sh start namenode.
 

Q3. What are different schedulers available in Hadoop?
A. a. COSHH: considers the workload, the cluster and user heterogeneity when making scheduling decisions.
    b. FIFO Scheduler: doesn’t consider heterogeneity, but orders jobs on the basis of their arrival time in the queue.
    c. Fair Sharing: defines a pool for each user; users can use their own pools to execute their jobs.
 

Q4. What Hadoop shell commands can be used to perform copy operation?
A. hadoop fs -copyToLocal
    hadoop fs -put
    hadoop fs -copyFromLocal
 

Q5. What’s the purpose of jps command?
A. It is used to check whether the Hadoop daemons on a node are running. The output of the jps command shows the status of the DataNode, NameNode, Secondary NameNode, JobTracker and TaskTracker processes.
 

Q6. How many NameNodes can be run on a single Hadoop cluster?
A. Only one.
 

Q7. What will happen when the NameNode on the Hadoop cluster is down?
A. Whenever the NameNode is down, the file system goes offline.
 

Q8. Detail crucial hardware considerations when deploying Hadoop in a production environment.
A. Operating System: a 64-bit operating system.
    Capacity: larger form factor (3.5”) disks allow more storage and cost less.
    Network: two TOR switches per rack for better redundancy.
    Storage: to achieve high performance and scalability, design the Hadoop platform to move the compute to the data rather than the data to the compute.
    Memory: the system’s memory requirements vary based on the application.
    Computational Capacity: can be determined by the total count of MapReduce slots available across the nodes of the cluster.
 

Q9. Which command will you use to determine if the HDFS (Hadoop Distributed File System) is corrupt?
A. The hadoop fsck (file system check) command, e.g. hadoop fsck /.
 

Q10. How can a Hadoop job be killed?
A. Using the command: hadoop job -kill <jobID>.
 

Q11. Can files be copied across multiple clusters? If yes, how?
A. Yes, it is possible using distributed copy. The distcp command can be used for both intra-cluster and inter-cluster copying.
 

Q12. Recommend the best Operating System to run Hadoop.
A. A Linux distribution such as Ubuntu is the best choice. Although Windows can be used, it can lead to several problems.
 

Q13. How often should the NameNode be reformatted?
A. Never, as it can lead to complete data loss. The NameNode is formatted only once, when the cluster is first set up.
 

Q14. What are Hadoop configuration files and where are they located?
A. Hadoop has 3 main configuration files – mapred-site.xml, hdfs-site.xml, and core-site.xml – which are located in the “conf” subdirectory of the Hadoop installation.
 

Check out – Best Free Resources For Sharpening Your Skills In Hadoop. These are just a few questions, but you may come across several others, depending on your Hadoop training.


 
Author Bio: Michael Warne is a tech blogger and an expert in Hadoop certification training. He has 5 years of experience in the Hadoop industry and has worked as a certified Hadoop professional for top-notch IT companies.



SQL Server 2017 Release Candidate (RC1, full & final version) is available for download


 
On 17th July 2017 Microsoft released the full & final Release Candidate 1 (RC1) version of SQL Server 2017.

As announced earlier with the first CTP release, the new SQL Server 2017 will run on both Windows and Linux. It will also be supported on Docker, and on macOS via Docker.
 

–> Download SQL Server 2017 bits:

To download SQL Server 2017 you can register and download the full version or the free 180-day evaluation version.

Or, directly download the ISO (~1.7 GB): SQLServer2017RC1-x64-ENU.iso
 

–> Check version and SQL build:

select @@version

Microsoft SQL Server 2017 (RC1) – 14.0.800.90 (X64)
Jul 11 2017 07:03:16
Copyright (C) 2017 Microsoft Corporation. All rights reserved.
Enterprise Evaluation Edition (64-bit) on Windows 10 Enterprise 10.0 (Build 14393: ) (Hypervisor)
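
You can also read the build details through SERVERPROPERTY; a minimal sketch (the sample values in the comments are only illustrations and will differ per installation):

SELECT SERVERPROPERTY('ProductVersion') AS ProductVersion	-- e.g. 14.0.800.90
	,SERVERPROPERTY('ProductLevel') AS ProductLevel		-- e.g. RC1, RTM
	,SERVERPROPERTY('Edition') AS Edition				-- e.g. Enterprise Evaluation Edition (64-bit)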

 

–> New Features & Enhancements: This Release Candidate is the final pre-release build of SQL Server 2017 and adds the following features:

1. SQL Server on Linux supports Active Directory Authentication, which enables domain-joined clients on either Windows or Linux to authenticate to SQL Server using their domain credentials and the Kerberos protocol.

2. SQL Server on Linux can use TLS to encrypt data that is transmitted across a network between a client application and an instance of SQL Server.

3. Added more model management capabilities for R Services on Windows Server, including External Library Management. The new release also supports Native Scoring.

4. Additional DMVs, enabling dependency analysis and reporting (DISCOVER_CALC_DEPENDENCY, MDSCHEMA_MEASUREGROUP_DIMENSIONS).

5. Added support for SSIS Scale Out in HA environments; customers can now enable Always On for SSIS.

6. Features already rolled out in previous CTP versions:

– All new features added in SQL Server 2016 SP1, [link].

– New features added in SQL Server 2017 CTP 1.x, [link].

– New features added in SQL Server 2017 CTP 2.x, [link].
 

–> Videos on SQL Server 2017

Download & Install SQL Server 2017 & SSMS on Windows

Install SQL Server on Linux
 

–> References:

>> SQL Server 2017 official Page

>> Docs for SQL Server 2017


SQL DBA – Query to check Status of last running SQL Jobs


 
To check the status of currently running jobs you can use the “Job Activity Monitor” feature in SSMS, but opening and refreshing the tool sometimes takes time.

Here is a simple query you can run in SSMS or from any custom tool to get the latest run status of your SQL Agent jobs:
 

–> Query #1: This query uses the sysjobs view to get the list of all jobs and the sysjobhistory view to get the latest status record of each job.

SELECT 
	j.name AS JobName
	,CONVERT(VARCHAR,DATEADD(S,(run_time/10000)*60*60 /* hours */  
          +((run_time - (run_time/10000) * 10000)/100) * 60 /* mins */  
          + (run_time - (run_time/100) * 100)  /* secs */
           ,CONVERT(DATETIME,RTRIM(run_date),113)),100) AS RunTimeStamp
	,CASE 
		WHEN j.enabled = 1 THEN 'Enabled'  
		ELSE 'Disabled'  
	END JobStatus
	,CASE 
		WHEN jh.run_status = 0 THEN 'Failed'
		WHEN jh.run_status = 1 THEN 'Succeeded'
		WHEN jh.run_status = 2 THEN 'Retry'
		WHEN jh.run_status = 3 THEN 'Cancelled'
		ELSE 'Unknown'  
	END JobRunStatus
FROM msdb.dbo.sysjobs j
OUTER APPLY(
	SELECT TOP 1 * 
	FROM msdb.dbo.sysjobhistory jh
	WHERE jh.job_id = j.job_id AND jh.step_id = 0 
	order by jh.instance_id DESC) jh
where j.name like 'ADF%'
ORDER BY j.name, jh.run_date, jh.run_time 

 

–> Query #2: This query uses the sysjobs view to get the list of all jobs and the sysjobservers view to get the last run status of each job.

SELECT 
	j.name AS JobName
	,IIF(js.last_run_date > 0, 
		DATETIMEFROMPARTS(js.last_run_date/10000, js.last_run_date/100%100, js.last_run_date%100, 
		js.last_run_time/10000, js.last_run_time/100%100, js.last_run_time%100, 0), 
		NULL) AS RunTimeStamp
	,CASE 
		WHEN j.enabled = 1 THEN 'Enabled'  
		ELSE 'Disabled'  
	END JobStatus
	,CASE 
		WHEN js.last_run_outcome = 0 THEN 'Failed'
		WHEN js.last_run_outcome = 1 THEN 'Succeeded'
		WHEN js.last_run_outcome = 2 THEN 'Retry'
		WHEN js.last_run_outcome = 3 THEN 'Cancelled'
		ELSE 'Unknown'  
	END JobRunStatus
FROM msdb.dbo.sysjobs j
JOIN msdb.dbo.sysjobservers js on js.job_id = j.job_id
where j.name like 'ADF%'
ORDER BY j.name, js.last_run_date, js.last_run_time 

 

As you may have noticed, the two queries above calculate the RunTimeStamp in different ways: the first parses the run_date and run_time integer columns manually, while the second uses the DATETIMEFROMPARTS() function. A small illustration of both follows below.
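Here is a minimal sketch of the two approaches side by side, using a made-up run_date/run_time pair (21 Aug 2017, 13:30:45) purely for illustration; both return the same datetime:

DECLARE @run_date INT = 20170821, @run_time INT = 133045;

-- Query #1 style: turn the HHMMSS integer into seconds and add them to the date
SELECT DATEADD(SECOND
	,(@run_time/10000)*3600                             /* hours */
	+((@run_time - (@run_time/10000)*10000)/100)*60     /* mins  */
	+(@run_time % 100)                                  /* secs  */
	,CONVERT(DATETIME, CAST(@run_date AS VARCHAR(8)))) AS RunTimeStamp_Parsed;

-- Query #2 style: build the datetime directly from its date and time parts
SELECT DATETIMEFROMPARTS(@run_date/10000, @run_date/100%100, @run_date%100
	,@run_time/10000, @run_time/100%100, @run_time%100, 0) AS RunTimeStamp_FromParts;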



SQL DBA – SQL Agent Job history not showing or vanishing


 
This happened when we started working on a new SQL Server instance for our DEV environment: the history of SQL jobs was not being retained after a day or a few runs.

I checked the SQL Agent properties and found that some configuration values were set very low:
jobhistory_max_rows = 10000
jobhistory_max_rows_per_job = 100

So I checked MSDN, found the maximum permissible values, and set them to 999999 and 49999 respectively.

–> This can also be changed with the T-SQL below, using the system stored procedure sp_set_sqlagent_properties:

USE [msdb]
GO

EXEC msdb.dbo.sp_set_sqlagent_properties 
	@jobhistory_max_rows=999999, 
	@jobhistory_max_rows_per_job=49999
GO
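
To confirm the change took effect, one option is to read the agent properties back (a rough sketch; sp_get_sqlagent_properties is an undocumented companion procedure, so treat its availability on your build as an assumption), or simply watch the retained history per job grow over the next runs:

-- Read the SQL Agent properties back (undocumented companion procedure)
EXEC msdb.dbo.sp_get_sqlagent_properties;

-- Or check how many history rows are being retained per job
SELECT j.name AS JobName, COUNT(*) AS HistoryRows
FROM msdb.dbo.sysjobs j
JOIN msdb.dbo.sysjobhistory jh ON jh.job_id = j.job_id
GROUP BY j.name
ORDER BY HistoryRows DESC;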


SQL DBA – Stop multiple running SQL Jobs at once


 
Just like my previous post on Enable/Disable multiple SQL Jobs [link], here I will show how we can stop multiple running SQL Agent jobs at once.

–> You can stop a single SQL job in SSMS by simply right-clicking on it, but if you have multiple SQL jobs running, stopping each of them individually is a cumbersome and time-consuming task.

I will show how this can be done with a T-SQL query:

USE msdb;
GO

-- Stop Multiple running SQL Jobs at once:
DECLARE @dynSql NVARCHAR(MAX) = ''

SELECT @dynSql += N'EXEC msdb.dbo.sp_stop_job @job_name = ''' + j.name + N''';' 
			+ CHAR(13) + CHAR(10) -- EXEC prefix so each generated call is a valid standalone statement
FROM msdb.dbo.sysjobs j
JOIN msdb.dbo.sysjobactivity AS ja 
ON ja.job_id = j.job_id
WHERE ja.start_execution_date IS NOT NULL -- the job has started
AND ja.stop_execution_date IS NULL        -- ...but has not finished, i.e. it is still running
ORDER BY j.name;

PRINT @dynSql;
GO

Simply copy-paste the dynamically generated SQL script printed above and execute it; it will stop all running SQL jobs at once.
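
Alternatively, if you prefer not to copy-paste, you can replace the PRINT statement above with (or follow it by, within the same batch, before the GO) a direct execution of the generated script; a small sketch:

-- Execute the generated script directly instead of copy-pasting it
EXEC sp_executesql @dynSql;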

