How to change the columns of a Pandas DataFrame into a column MultiIndex?

Question:

I have a Pandas DataFrame with a column index like the one below:

+----+----+----+----+----+----+
|  A1|  A2|  A3|  B1|  B2|  B3|
+----+----+----+----+----+----+
...the data

I would like to change the column index of this DataFrame to a multi-index, in the format shown below, without modifying the data, simply by adding an upper level to the index (with values A and B).

+--------------+--------------+
|        A     |        B     |
+----+----+----+----+----+----+
|  A1|  A2|  A3|  B1|  B2|  B3|
+----+----+----+----+----+----+
...the data

I have tried to use the pandas.MultiIndex function but with no luck. How can this be solved?

Asked By: Marioanzas


Answers:

The simplest way is to extract the first character of each column name and assign both levels back as a nested list:

import pandas as pd

df = pd.DataFrame(0, columns=['A1','A2','A3','B1','B2','B3'], index=[0])

df.columns = [df.columns.str[0], df.columns]
print (df)
   A        B      
  A1 A2 A3 B1 B2 B3
0  0  0  0  0  0  0
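
Since the question asks that the data stay untouched, here is a minimal sketch that checks this for the assignment above. The sample values and the variable names (values_before, is_multi, and so on) are made up for illustration and are not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data; the values are arbitrary.
df = pd.DataFrame([[1, 2, 3, 4, 5, 6]],
                  columns=['A1', 'A2', 'A3', 'B1', 'B2', 'B3'])
values_before = df.to_numpy().copy()

# Assigning a list of two equal-length array-likes builds a two-level column index.
df.columns = [df.columns.str[0], df.columns]

# The columns are now a MultiIndex, and the cell values are the same as before.
is_multi = isinstance(df.columns, pd.MultiIndex)
data_unchanged = (df.to_numpy() == values_before).all()

# Selecting a top-level key returns the sub-frame under that key.
group_a_columns = list(df['A'].columns)
```

Only the column labels change here; selecting df['A'] afterwards gives the A1, A2 and A3 columns as a regular DataFrame.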

If you need to extract all the uppercase letters at the start of each name:

df = pd.DataFrame(0, columns=['ADa1','ADs2','AD3','B1','B2','B3'], index=[0])

df.columns = [df.columns.str.extract('(^[A-Z]+)', expand=False), df.columns]
print (df)

    AD           B      
  ADa1 ADs2 AD3 B1 B2 B3
0    0    0   0  0  0  0

If you also need to set names for the column levels, use MultiIndex.from_arrays:

df = pd.DataFrame(0, columns=['ADa1','ADs2','AD3','B1','B2','B3'], index=[0])

df.columns = pd.MultiIndex.from_arrays([df.columns.str.extract('(^[A-Z]+)', expand=False), 
                                       df.columns], 
                                       names=('a','b'))
print (df)

a   AD           B      
b ADa1 ADs2 AD3 B1 B2 B3
0    0    0   0  0  0  0
Answered By: jezrael

You could extract the first letter separately and create a MultiIndex:

multi_index_level_0 = [c[0] for c in df.columns]
multi_index = [multi_index_level_0, df.columns.values]
df.columns = pd.MultiIndex.from_arrays(multi_index)
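
The same idea can be wrapped in a small helper if the grouping rule might change. This is only a sketch: add_top_level and its key parameter are hypothetical names introduced here, not part of pandas or of the answer above.

```python
import pandas as pd

def add_top_level(df, key=lambda name: name[0]):
    """Return a copy of df whose columns gain an upper level computed by key(name)."""
    # Compute the upper-level label for every existing column name.
    top = [key(c) for c in df.columns]
    out = df.copy()
    out.columns = pd.MultiIndex.from_arrays([top, df.columns.to_numpy()])
    return out

df = pd.DataFrame(0, columns=['A1', 'A2', 'A3', 'B1', 'B2', 'B3'], index=[0])
result = add_top_level(df)

# Labels of the new upper level, one per column.
top_labels = list(result.columns.get_level_values(0))
```

Working on a copy leaves the original df with its flat column index; drop the copy and assign to df.columns directly if in-place modification is preferred.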
Answered By: Mortz

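Here is an option using map. Note that set_axis returns a new DataFrame rather than modifying df, so assign the result if you want to keep it: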

df.set_axis(df.columns.map(lambda x: (x[0],x)),axis=1)

Output:

   A        B      
  A1 A2 A3 B1 B2 B3
0  0  0  0  0  0  0
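
For completeness, a self-contained sketch of the map approach that keeps the result. The names new_columns and df_multi are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(0, columns=['A1', 'A2', 'A3', 'B1', 'B2', 'B3'], index=[0])

# Index.map with a function that returns tuples produces a MultiIndex.
new_columns = df.columns.map(lambda x: (x[0], x))

# set_axis returns a new DataFrame, so the result is stored in a new variable.
df_multi = df.set_axis(new_columns, axis=1)
```

After this runs, df still has its original flat columns and df_multi carries the two-level index.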
Answered By: rhug123