Combine two pandas index slices

Question:

How can two pandas.IndexSlice objects be combined into one?

Setup of the problem:

import pandas as pd
import numpy as np

idx = pd.IndexSlice
cols = pd.MultiIndex.from_product([['A', 'B', 'C'], ['x', 'y'], ['a', 'b']])
df = pd.DataFrame(np.arange(len(cols)*2).reshape((2, len(cols))), columns=cols)

df:
    A               B               C            
    x       y       x       y       x       y    
    a   b   a   b   a   b   a   b   a   b   a   b
0   0   1   2   3   4   5   6   7   8   9  10  11
1  12  13  14  15  16  17  18  19  20  21  22  23

How can the two slices idx['A', 'y', :] and idx[['B', 'C'], 'x', :] be combined so that the result shows in one dataframe?

Separately they are:

df.loc[:, idx['A', 'y',:]]
    A    
    y    
    a   b
0   2   3
1  14  15


df.loc[:, idx[['B', 'C'], 'x', :]]
    B       C    
    x       x    
    a   b   a   b
0   4   5   8   9
1  16  17  20  21

Simply combining them as a list does not play nicely:

df.loc[:, [idx['A', 'y',:], idx[['B', 'C'], 'x',:]]]
....
TypeError: unhashable type: 'slice'

My current solution is incredibly clunky, but gives the sub df that I’m looking for:

df.loc[:, df.loc[:, idx['A', 'y', :]].columns.to_list() + df.loc[:,
       idx[['B', 'C'], 'x', :]].columns.to_list()]
    A       B       C    
    y       x       x    
    a   b   a   b   a   b
0   2   3   4   5   8   9
1  14  15  16  17  20  21

However, this doesn’t work when one of the slices selects a single column and so returns a Series (as expected), which is less fun:

df.loc[:, df.loc[:, idx['A', 'y', 'a']].columns.to_list() + df.loc[:,
       idx[['B', 'C'], 'x', :]].columns.to_list()]
...
AttributeError: 'Series' object has no attribute 'columns'

Are there any better alternatives to what I’m currently doing, ideally ones that work with both dataframe slices and series slices?

Asked By: AReubens


Answers:

A general solution is to select each slice separately and join the results with concat:

a = df.loc[:, idx['A', 'y', 'a']]
b = df.loc[:, idx[['B', 'C'], 'x', :]]

df = pd.concat([a, b], axis=1)
print(df)
    A   B       C    
    y   x       x    
    a   a   b   a   b
0   2   4   5   8   9
1  14  16  17  20  21
Answered By: jezrael
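
As a complement to the concat approach above, here is a minimal sketch that combines the slices at the column-position level instead of concatenating the selected frames. It assumes the column MultiIndex is lexsorted (as it is in the question's setup) and uses MultiIndex.get_locs to turn each slice into integer positions. The helper name select_slices is hypothetical, not part of pandas.

```python
import numpy as np
import pandas as pd

idx = pd.IndexSlice
cols = pd.MultiIndex.from_product([['A', 'B', 'C'], ['x', 'y'], ['a', 'b']])
df = pd.DataFrame(np.arange(len(cols) * 2).reshape((2, len(cols))), columns=cols)


def select_slices(frame, *slices):
    """Hypothetical helper: select the columns matched by several slices.

    Each slice is converted to integer column positions with
    MultiIndex.get_locs, and the positions are joined in the order given.
    """
    locs = np.concatenate([frame.columns.get_locs(s) for s in slices])
    return frame.iloc[:, locs]


# Works whether a slice matches one column or many, because the
# selection happens on positions rather than on Series/DataFrame results.
sub = select_slices(df, idx['A', 'y', 'a'], idx[['B', 'C'], 'x', :])
print(sub)
```

Because the positions are simply concatenated, overlapping slices would repeat columns; wrapping locs in np.unique would remove the duplicates at the cost of returning the columns in their original order.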

One option is pyjanitor's select_columns, which can select via tuples:

# pip install pyjanitor
import pandas as pd
import janitor  # registers select_columns on DataFrame

df.select_columns(('A','y'), ('B','x'), ('C','x'))
    A       B       C    
    y       x       x    
    a   b   a   b   a   b
0   2   3   4   5   8   9
1  14  15  16  17  20  21
Answered By: sammywemmy
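
If adding a dependency is not an option, a similar per-level selection can be sketched in plain pandas with boolean masks built from Index.get_level_values. This is an illustration only, not something taken from the answers above. The variable names lvl0, lvl1 and mask are made up for the example.

```python
import numpy as np
import pandas as pd

cols = pd.MultiIndex.from_product([['A', 'B', 'C'], ['x', 'y'], ['a', 'b']])
df = pd.DataFrame(np.arange(len(cols) * 2).reshape((2, len(cols))), columns=cols)

# Label arrays for the first two column levels.
lvl0 = df.columns.get_level_values(0)
lvl1 = df.columns.get_level_values(1)

# (A, y, anything) OR (B or C, x, anything)
mask = ((lvl0 == 'A') & (lvl1 == 'y')) | (lvl0.isin(['B', 'C']) & (lvl1 == 'x'))

sub = df.loc[:, mask]
print(sub)
```

A boolean mask always yields a dataframe, even when only one column matches, and it keeps the columns in their original order rather than the order in which the conditions are written.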