Archive

Posts Tagged ‘Aggregate Functions’

SQL Basics – AGGREGATE Functions in SQL Server

December 21, 2016 Leave a comment

 
Aggregate Functions provides us basic calculation over a group or a set of values. Aggregate functions are deterministic functions as they return the same value every time they are called by using a specific set of input values.

The Basic Aggregate Functions used day today in SQL Server are: COUNT, MIN, MAX, SUM, AVG.
 

You can use below query to create an Employee table for this demo:

USE [AdventureWorks2014]
GO

DROP TABLE IF EXISTS dbo.Employee

SELECT
    P.BusinessEntityID AS EmpID, P.Title, P.FirstName, P.MiddleName, P.LastName, 
    E.Gender, E.MaritalStatus, E.BirthDate AS DOB, E.HireDate AS DOJ, E.JobTitle, 
    D.Name AS DeptName, S.Rate
INTO dbo.Employee -- Create a new table here and load query data
FROM [Person].[Person] P
INNER JOIN [HumanResources].[Employee] E 
		ON E.BusinessEntityID = P.BusinessEntityID
CROSS APPLY (
    SELECT TOP 1 DepartmentID
    FROM [HumanResources].[EmployeeDepartmentHistory] DH 
    WHERE DH.BusinessEntityID = E.BusinessEntityID 
    ORDER BY StartDate DESC) EDH
INNER JOIN [HumanResources].[Department] D 
		ON D.DepartmentID = EDH.DepartmentID
INNER JOIN [HumanResources].[EmployeePayHistory] S 
		ON S.BusinessEntityID = P.BusinessEntityID

SELECT * FROM dbo.Employee -- 316
GO

 

1. COUNT() function returns the number of rows in a table or a group, and returns an INT (integer).

-- To get count of all reords in a table:
select COUNT(*) from dbo.Employee -- 316

-- Within Group: get count of records having Gender = 'M':
select COUNT(*) from dbo.Employee WHERE Gender = 'M' -- 228

-- Using DISTINCT with COUNT:
select COUNT(FirstName) from dbo.Employee -- 316
select COUNT(DISTINCT FirstName) from dbo.Employee -- 224

select COUNT(LastName) from dbo.Employee -- 316
select COUNT(DISTINCT LastName) from dbo.Employee -- 270

–> Using COUNT() function with GROUP BY

-- Get count of Employees in every Department:
select 
	DeptName, 
	COUNT(*) as EmpCount
from dbo.Employee
GROUP BY DeptName

 

2. MIN(), MAX() and AVG() functions returns the Minimum, Maximum and Average values from a table or a group.

-- Check the Minimum, Maximum and Average salaries of all Employes:
SELECT 
	MIN(Salary) as minSal, 
	MAX(Salary) as maxSal, 
	AVG(Salary) as avgSal
FROM dbo.Employee

–> Using MIN(), MAX(), AVG() functions with GROUP BY

-- Check Department wise Minimum, Maximum and Average salaries:
SELECT 
	DepartmentName, 
	MIN(Salary) as minSal, 
	MAX(Salary) as maxSal, 
	AVG(Salary) as avgSal
FROM dbo.Employee
GROUP BY DepartmentName

 

3. SUM() function returns the Sum of all values from a table or a group.

-- Check the budget of a company to pay their Employee's salary:
select SUM(Salary) as TotalSalary
from dbo.Employee

–> Using SUM() function with GROUP BY

-- Check Department wise budget of salary:
select 
	DepartmentName, 
	SUM(Salary) as TotalSalary
from dbo.Employee
GROUP BY DepartmentName

 

4. Using all Aggregate function together to get all information in a single query:

SELECT 
	COUNT(*)	as EmpCount, 
	MIN(Salary) as minSal, 
	MAX(Salary) as maxSal, 
	AVG(Salary) as avgSal, 
	SUM(Salary) as TotalSalary
FROM dbo.Employee

-- GROUP BY example with Aggregate functions:</strong>
SELECT 
	DepartmentName, 
	COUNT(*)	as EmpCount, 
	MIN(Salary) as minSal, 
	MAX(Salary) as maxSal, 
	AVG(Salary) as avgSal, 
	SUM(Salary) as TotalSalary
FROM dbo.Employee
GROUP BY DepartmentName

 

–> More Aggregate Functions in SQL Server are: COUNT_BIG, CHECKSUM_AGG, STDEV, STDEVP, GROUPING, GROUPING_ID, VAR, VARP. You can check them here: https://docs.microsoft.com/en-us/sql/t-sql/functions/aggregate-functions-transact-sql
 


Advertisement
Categories: SQL Basics Tags:

SQL Server 2012 (a.k.a Denali) – New feature | Enhanced OVER() Clause – Part1

April 13, 2012 1 comment

OVER() clause in SQL Server was introduced in version 2005 and it was the same in 2008 and even 2008-R2.

Ranking functions like ROW_NUMBER(), RANK(), DENSE_RANK() & NTILE() can use the OVER() clause to Parition and Order a record-set before actually applying their logic. Thus a large record-set can be divided into several partitions and calculations can be applied to each set in a set-based approach rather than going with Loops or Cursors.

With the new version of SQL Server i.e. 2012 the OVER() clause has been extended completely to other aggregates, like SUM(), AVG(), etc. Prior to this the OVER() clause can be partially used for these Aggregate Functions.

Here we will see solving a classic problem of calculating Running Totals with this new feature.
I’ve already discussed about solving this problem with JOINS in version 2005,2008, 2008-R2 in my previous post at following [link].

Let’s see how can we use the SUM() Aggregate Function with OVER() clause to solve this:

USE [AdventureWorks2012]
GO

-- With 2012, calculating Running totals by using Aggregare function SUM() with extended Window Function OVER():
;with RunTot as (
	select 
		row_number() over(order by s.OrderDate) as row,
		s.SalesPersonID, p.FirstName, p.LastName, s.TotalDue, s.OrderDate
	from Sales.SalesOrderHeader s
	join Person.Person p
	on s.SalesPersonID = p.BusinessEntityID
	)
SELECT 
	SalesPersonID, FirstName, LastName, OrderDate, TotalDue, 
	SUM(TotalDue) OVER(partition by SalesPersonID ORDER BY row)
FROM RunTot
order by SalesPersonID, row
GO


-- Prior to 2012 calculating Running totals by using SELF-JOIN:
;with RunTot as (
	select 
		row_number() over(partition by s.SalesPersonID order by s.OrderDate) as row,
		s.SalesPersonID, p.FirstName, p.LastName, s.TotalDue, s.OrderDate
	from Sales.SalesOrderHeader s
	join Person.Person p
	on s.SalesPersonID = p.BusinessEntityID)
select a.row, a.SalesPersonID, a.FirstName, a.LastName, a.OrderDate, a.TotalDue, sum(b.TotalDue) as RunTotal
from RunTot a
join RunTot b on a.SalesPersonID = b.SalesPersonID and a.row >= b.row
group by a.row, a.SalesPersonID, a.FirstName, a.LastName, a.TotalDue, a.OrderDate
order by a.SalesPersonID, a.row

On comparing performance of both the queries within the batch:
– The first query with SUM() window function costs just 38%.
– While the second query without the SUM() window function by using JOINS costs 68%.

I still like to call SQL Server’s latest release by its code name “Denali”, not “SQL Server 2012”.

Stay tuned to other new features of Denali in my forthcomming posts.

For more information about the OVER() clause check MS BOL link here: http://msdn.microsoft.com/en-us/library/ms189461.aspx

Using OVER and PARTION BY clauses with Aggregate Functions in SQL Server

August 10, 2010 5 comments

In my [previous post] we learned about MS SQL Server 2005 Ranking functions.

Here we’ll see more scenarios where they can be used. We can use them with CTE i.e. Common Table Expressions, again a new gem with ver. 2005.

with myCTE as (
    select row_number() over(partition by class order by marks desc) as [RowNumber],
    class, marks, stuName from #tempTable)
select * from myCTE 
where RowNumber < 5
This gives following result:
row_num class marks name
1       A     90    pooja
2       A     90    saurabh
3       A     80    kanchan
4       A     80    manoj
1       B     90    paras
2       B     80    dinesh
3       B     70    hema
4       B     70    kapil

A CTE (Common Table Expression) can be thought of as a temporary result set that is defined within the execution scope of a single SELECT, INSERT, UPDATE, DELETE, or CREATE VIEW statement. A CTE is similar to a derived table in that it is not stored as an object and lasts only for the duration of the query. Unlike a derived table, a CTE can be self-referencing and can be referenced multiple times in the same query.

More on CTE: http://msdn.microsoft.com/en-us/library/ms190766.aspx

The above mentioned PARTITION BY clause can also be used with SQL aggregate functions, like: AVG, COUNT, MIN, MAX, etc.

Here is an example:

select class, stuName, marks,
    AVG(marks) over(partition by class) AS [avg],
    SUM(marks) over(partition by class) AS [sum],
    MIN(marks) over(partition by class) AS [max],
    MAX(marks) over(partition by class) AS [min],
    COUNT(marks) over(partition by class) AS [count],
    STDEV(marks) over(partition by class) AS [ST_Dev],
    VAR(marks) over(partition by class) AS [VAR]
from #tempTable
This gives following result:
class name     marks AVG  SUM  MIN  MAX  COUNT STDEV   VAR
A     anita    50    76   460  50   90   6     15.0554 226.6667
A     harish   70    76   460  50   90   6     15.0554 226.6667
A     kanchan  80    76   460  50   90   6     15.0554 226.6667
A     manoj    80    76   460  50   90   6     15.0554 226.6667
A     pooja    90    76   460  50   90   6     15.0554 226.6667
A     saurabh  90    76   460  50   90   6     15.0554 226.6667
B     kamar    50    67   470  50   90   7     14.9603 223.8095
B     lalit    50    67   470  50   90   7     14.9603 223.8095
B     nitin    60    67   470  50   90   7     14.9603 223.8095
B     hema     70    67   470  50   90   7     14.9603 223.8095
B     kapil    70    67   470  50   90   7     14.9603 223.8095
B     dinesh   80    67   470  50   90   7     14.9603 223.8095
B     paras    90    67   470  50   90   7     14.9603 223.8095

I’ve rounded the last 2 column values to 4 decimal places.