Archive

Archive for April 14, 2021

Python error – Length of passed values is 6, index implies 2 (while doing PIVOT with MultiIndex or multiple columns)

April 14, 2021 Leave a comment

 

As I’m new to Python and these days using it for some Data Analysis & Metadata handling purpose, and also being from SQL background here I’m trying to use as many analysis features I use with SQL, like Group By, Aggregate functions, Filtering, Pivot, etc.

 

Now I had this particular requirement to PIVOT some columns based on multi-index keys, or multiple columns as shown below:

Python PIVOT

But while using PIVOT function in Python I was getting this weird error that I was not able to understand, because this error was not coming with the use of single index column.

ValueError: Length of passed values is 6, index implies 2.

 

Let’s create a sample Pandas DataFrame for our demo purpose:

import pandas as pd

sampleData = {
  'Country': ['India', 'India', 'India','India','USA','USA'],
  'State': ['UP','UP','TS','TS','CO','CO'],
  'CaseStatus': ['Active','Closed','Active','Closed','Active','Closed'],
  'Cases': [100, 150, 300, 400, 200, 300]}

df = pd.DataFrame(sampleData)

df

 

Problem/Issue:

But when I tried using the below code which uses pivot() function with multiple columns or multi-indexes it started throwing error. I was not getting error when I used single index/column below. So what could be the reason?

# Passing multi-index (or multiple columns) fails here:
df_pivot = df.pivot(index = ['Country','State'], 
                    columns = 'CaseStatus', 
                    values = 'Cases').reset_index()

df_pivot

 

Solution #1

Online checking I found that the pivot() function only accepts single column index key (do not accept multiple columns list as index). So, in this case first we would need to use set_index() function and set the list of columns as shown below:

# Use pivot() function with set_index() function:
df_pivot = (df.set_index(['Country', 'State'])
        .pivot(columns='CaseStatus')['Cases']
        .reset_index()
     )

df_pivot

 

Solution #2

There is one more simple option where you can use pivot_table() function as shown below and get the desired output:

# or use pivot_table() function:
df_pivot = pd.pivot_table(df, 
                          index = ['Country', 'State'], 
                          columns = 'CaseStatus', 
                          values = 'Cases')

df_pivot

Categories: Python Tags: , , ,