Adding another level of columns to a pandas data frame

Question:

I have a data frame with 60 columns. For the sake of illustration, I will show an example data frame that looks like mine but is much smaller:

       0         1     ...     0         1     ...
0 -0.611064 -0.032586  ... -0.102049  1.582183 ...

What I want is to add another level of columns on top of the existing columns, so it becomes something like this:

           A                       B
      0         1     ...     0         1      ... 
0 -0.611064 -0.032586 ...  -0.102049  1.582183 ... 

I have tried the following:

df.columns = pd.MultiIndex.from_product([['A','B'], df.columns])

but I got an error that says

ValueError: Length mismatch: Expected axis has 60 elements, new values have 120 elements

After some searching, I learned that the error occurs because the number of columns I am trying to assign is larger than the number of existing columns, but I still haven't been able to solve this problem.

I also tried several other methods, such as pd.MultiIndex.from_tuples and pd.MultiIndex.from_arrays, but they produced other errors.

Edit:
Here is a reproducible example:

import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randn(1,4), columns=[0,1,0,1])
df.columns = pd.MultiIndex.from_product([['A','B'], df.columns])
print(df)

Can anyone point to a solution to this problem?
Thanks in advance.

Asked By: Adel Moustafa

Answers:

IIUC, you just need to create a sequence of repeated A's and B's, with the length of the sequence equal to the number of your current columns. You can do it using numpy:

import numpy as np

df.columns = pd.MultiIndex.from_arrays(
    [np.repeat(['A','B'], df.shape[1] // 2), df.columns]
)

The first array above will look like ['A','A','B','B',...].
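
As a rough sketch of why this works where from_product did not, here is the same idea applied to the 4-column reproducible example from the question. from_product pairs every top-level label with every existing column (2 x 4 = 8 entries), while from_arrays pairs the labels element by element, so the length stays at 4. The variable names product and top are only for illustration.

```python
import numpy as np
import pandas as pd

# Same shape as the question's reproducible example: 4 columns labelled 0, 1, 0, 1.
df = pd.DataFrame(np.random.randn(1, 4), columns=[0, 1, 0, 1])

# from_product builds the Cartesian product: every top label is combined
# with every existing column, giving 2 * 4 = 8 entries for a 4-column frame.
product = pd.MultiIndex.from_product([["A", "B"], df.columns])
print(len(product), "entries from from_product vs", df.shape[1], "columns")

# from_arrays pairs the two arrays position by position, so the new
# index has exactly as many entries as there are columns.
top = np.repeat(["A", "B"], df.shape[1] // 2)
df.columns = pd.MultiIndex.from_arrays([top, df.columns])
print(df)
```

This assumes an even number of columns, since df.shape[1] // 2 rounds down and would leave the label array one entry short otherwise.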


Update:

In response to your follow-up question in the comments below, if you have an odd number of columns, you can utilize np.resize to create a full array of alternating pairs of A's and B's (i.e. ['A','A','B','B','A','A',...]) with the length equal to the number of columns in your dataframe. Then, you can pass that array to pd.MultiIndex.from_arrays():

df.columns = pd.MultiIndex.from_arrays(
    [np.resize(np.repeat(['A','B'], 2), df.shape[1]), df.columns]
)
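
For instance, a minimal sketch with a hypothetical 5-column frame (df_odd and its column labels are made up for illustration, not taken from the question) might look like this:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with an odd number of columns (5), for illustration only.
df_odd = pd.DataFrame(np.random.randn(1, 5), columns=[0, 1, 0, 1, 0])

# np.repeat builds one A, A, B, B cycle; np.resize repeats that cycle
# as often as needed and cuts it off at the requested length.
top = np.resize(np.repeat(["A", "B"], 2), df_odd.shape[1])
print(top.tolist())

df_odd.columns = pd.MultiIndex.from_arrays([top, df_odd.columns])
print(df_odd)
```

Note that with this pattern the fifth column falls under "A" again, because the cycle simply starts over once the first four labels are used up.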
Answered By: AlexK