Archive

Posts Tagged ‘PIVOT’

Python error – Length of passed values is 6, index implies 2 (while doing PIVOT with MultiIndex or multiple columns)

April 14, 2021 Leave a comment

 

As I’m new to Python and these days using it for some Data Analysis & Metadata handling purpose, and also being from SQL background here I’m trying to use as many analysis features I use with SQL, like Group By, Aggregate functions, Filtering, Pivot, etc.

 

Now I had this particular requirement to PIVOT some columns based on multi-index keys, or multiple columns as shown below:

Python PIVOT

But while using PIVOT function in Python I was getting this weird error that I was not able to understand, because this error was not coming with the use of single index column.

ValueError: Length of passed values is 6, index implies 2.

 

Let’s create a sample Pandas DataFrame for our demo purpose:

import pandas as pd

sampleData = {
  'Country': ['India', 'India', 'India','India','USA','USA'],
  'State': ['UP','UP','TS','TS','CO','CO'],
  'CaseStatus': ['Active','Closed','Active','Closed','Active','Closed'],
  'Cases': [100, 150, 300, 400, 200, 300]}

df = pd.DataFrame(sampleData)

df

 

Problem/Issue:

But when I tried using the below code which uses pivot() function with multiple columns or multi-indexes it started throwing error. I was not getting error when I used single index/column below. So what could be the reason?

# Passing multi-index (or multiple columns) fails here:
df_pivot = df.pivot(index = ['Country','State'], 
                    columns = 'CaseStatus', 
                    values = 'Cases').reset_index()

df_pivot

 

Solution #1

Online checking I found that the pivot() function only accepts single column index key (do not accept multiple columns list as index). So, in this case first we would need to use set_index() function and set the list of columns as shown below:

# Use pivot() function with set_index() function:
df_pivot = (df.set_index(['Country', 'State'])
        .pivot(columns='CaseStatus')['Cases']
        .reset_index()
     )

df_pivot

 

Solution #2

There is one more simple option where you can use pivot_table() function as shown below and get the desired output:

# or use pivot_table() function:
df_pivot = pd.pivot_table(df, 
                          index = ['Country', 'State'], 
                          columns = 'CaseStatus', 
                          values = 'Cases')

df_pivot

Advertisement
Categories: Python Tags: , , ,

Creating “Dynamic PIVOT” scripts in SQL Server

January 25, 2011 24 comments

My previous PIVOT UNPIVOT post describes to organize data in a cross tab fashion or to transpose columns into rows and vice-versa.

This is fine only when you know how many columns you need. But when columns vary according to the query then how could you apply them on your scripts. The only way is to store them in a string variable at runtime and apply them in a dynamic SQL query, shown below.

This problem was also discussed on MSDN’s following link: http://social.msdn.microsoft.com/Forums/en/transactsql/thread/26f86fd6-6d06-4cc5-9723-9ee4685bd48a where I proposed an answer also described below:

USE [tempdb]
GO

-- Create test tables
create table table1 (number int, desc varchar(20),
location int, numberatlocation int)
create table table2 (code int, name varchar(20))

-- Insert test data
insert into table1 values (12345,'test',1000,5)
insert into table1 values (12345,'test',1001,2)
insert into table1 values (12345,'test',1002,4)
insert into table1 values (12345,'test',1003,9)
insert into table1 values (12345,'test',1004,7)

insert into table2 values (1000,'loc1')
insert into table2 values (1001,'loc2')
insert into table2 values (1002,'loc3')
insert into table2 values (1003,'loc4')
insert into table2 values (1004,'loc5')

-- Static PIVOT
select number, description, [loc1], [loc2], [loc3], [loc4], [loc5]
from (select number, desc, numberatlocation, name
from table1 join table2 on table1.location=table2.code)p
PIVOT(MAX (numberatlocation) FOR Name IN ( [loc1], [loc2], [loc3], [loc4], [loc5] )
) AS pvt
ORDER BY number
Output of Static query:
number	desc	loc1	loc2	loc3	loc4	loc5
12345	test    5       2       4       9       7
-- Dynamic PIVOT

-- Lets add one more record on both the tables to check the results
insert into table1 values (12345,'test',1005,3)
insert into table2 values (1005,'loc6')

declare @col varchar(1000)
declare @sql varchar(2000)

select @col = COALESCE(@col + ', ','') + QUOTENAME(name)
from table2

select @col -- This gives: [loc1], [loc2], [loc3], [loc4], [loc5], [loc6]

-- Now setting this @col variable in the Dynamic SQL.
set @sql = '
select number, desc, ' + @col + '
from (select number, desc, numberatlocation, name
from table1 join table2 on table1.location=table2.code)p
PIVOT(MAX (numberatlocation) FOR Name IN ( ' + @col + ' )
) AS pvt
ORDER BY number'

print @sql

exec (@sql)
Output of Dynamic query:
number	desc	loc1	loc2	loc3	loc4	loc5    loc6
12345	test    5       2       4       9       7       3
-- Final Cleanup
drop table table1
drop table table2
Categories: SQL Tips Tags: , ,

MS SQL Server 2005 – New Feature | PIVOT and UNPIVOT

April 12, 2009 7 comments

Using PIVOT and UNPIVOT Operator

You can use the PIVOT and UNPIVOT relational operators to manipulate a table-valued expression into another table. PIVOT rotates a table-valued expression by turning the unique values from one column in the expression into multiple columns in the output, and performs aggregations where necessary on any remaining column values that are desired in the final output. UNPIVOT performs the opposite operation to PIVOT by rotating columns of a table-valued expression into column values.

PIVOT provides syntax that is simpler and more readable than what may otherwise be specified in a complex series of SELECT…CASE statements.
A common scenario where PIVOT can be useful is when you want to generate cross-tabulation reports to summarize data. For example, suppose you want to query the PurchaseOrderHeader table in the AdventureWorks sample database to determine the number of purchase orders placed by certain employees. The following query provides this report, broken down by vendor:

USE AdventureWorks;
GO
SELECT VendorID, [164] AS Emp1, [198] AS Emp2, [223] AS Emp3, [231] AS Emp4, [233] AS Emp5
FROM
(SELECT PurchaseOrderID, EmployeeID, VendorID
FROM Purchasing.PurchaseOrderHeader) p
PIVOT
(
COUNT (PurchaseOrderID)
FOR EmployeeID IN
( [164], [198], [223], [231], [233] )
) AS pvt
ORDER BY VendorID
Output:
VendorID Emp1 Emp2 Emp3 Emp4 Emp5
1 	 4    3    5    4    4
2 	 4    1    5    5    5
3 	 4    3    5    4    4
4 	 4    2    5    5    4
5 	 5    1    5    5    5

This means that the unique values returned by the EmployeeID column themselves become fields in the final result set. As a result, there is a column for each EmployeeID number specified in the pivot clause — in this case employees 164, 198, 223, 231, and 233. The PurchaseOrderID column serves as the value column, against which the columns returned in the final output, called the grouping columns, are grouped. In this case, the grouping columns are aggregated by the COUNT function. Note that a warning message appears indicating that any NULL values appearing in the PurchaseOrderID column were not considered when computing the COUNT for each employee.

UNPIVOT performs almost the reverse operation of PIVOT, by rotating columns into rows. Suppose the table produced in the example above is stored in the database as pvt, and you want to rotate the column identifiers Emp1, Emp2, Emp3, Emp4, and Emp5 into row values that correspond to a particular vendor. This means that you must identify two additional columns. The column that will contain the column values you are rotating (Emp1, Emp2,…) will be called Employee, and the column that will hold the values that currently reside under the columns being rotated will be called Orders. These columns correspond to the pivot_column and value_column, respectively, in the Transact-SQL definition. The query looks like this.

Create the table and insert values as portrayed in the above example:

CREATE TABLE pvt (VendorID int, Emp1 int, Emp2 int,
Emp3 int, Emp4 int, Emp5 int)
GO
INSERT INTO pvt VALUES (1,4,3,5,4,4)
INSERT INTO pvt VALUES (2,4,1,5,5,5)
INSERT INTO pvt VALUES (3,4,3,5,4,4)
INSERT INTO pvt VALUES (4,4,2,5,5,4)
INSERT INTO pvt VALUES (5,5,1,5,5,5)
GO
--Unpivot the table.
SELECT VendorID, Employee, Orders
FROM
(SELECT VendorID, Emp1, Emp2, Emp3, Emp4, Emp5
FROM pvt) p
UNPIVOT
(Orders FOR Employee IN
(Emp1, Emp2, Emp3, Emp4, Emp5)
)AS unpvt
GO
Output:
VendorID Employee Orders
1 	 Emp1 	  4
1 	 Emp2 	  3
1 	 Emp3 	  5
1 	 Emp4 	  4
1 	 Emp5 	  4
2 	 Emp1 	  4
2 	 Emp2 	  1
2 	 Emp3 	  5
2 	 Emp4 	  5
2 	 Emp5 	  5
...

Note: UNPIVOT is not the exact reverse of PIVOT. PIVOT performs an aggregation and hence merges possible multiple rows into a single row in the output. UNPIVOT does not reproduce the original table-valued expression result because rows have been merged. Besides, NULL values in the input of UNPIVOT disappear in the output, whereas there may have been original NULL values in the input before the PIVOT operation.

Reference taken from MS BOL: http://msdn.microsoft.com/en-us/library/ms177410.aspx