Append new level to DataFrame column
Question:
Given a DataFrame, how can I add a new level to the columns based on an iterable given by the user? In other words, how do I append a new level?
The question "How to simply add a column level to a pandas dataframe" shows how to add a new level given a single value, so it doesn’t cover this case.
Here is the expected behaviour:
>>> import pandas as pd
>>> df = pd.DataFrame(0, columns=["A", "B"], index=range(2))
>>> df
   A  B
0  0  0
1  0  0
>>> append_level(df, ["C", "D"])
   A  B
   C  D
0  0  0
1  0  0
The solution should also work with MultiIndex columns, so
>>> append_level(append_level(df, ["C", "D"]), ["E", "F"])
   A  B
   C  D
   E  F
0  0  0
1  0  0
Answers:
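def append_level(df, new_level):
    new_df = df.copy()
    cols = df.columns
    # Use get_level_values: zip(*cols) would split flat string labels into characters.
    old = [cols.get_level_values(i) for i in range(cols.nlevels)]
    new_df.columns = pd.MultiIndex.from_arrays([*old, list(new_level)])
    return new_df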
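As a slightly more defensive variant, here is a sketch that checks the length of the new level and optionally names it. The `name` parameter and the length check are my own additions for illustration, not something the question asks for. It assumes pandas is imported as `pd` and builds the columns from `get_level_values`, so flat columns with multi-character labels such as "foo" work as well.

```python
import pandas as pd


def append_level(df, new_level, name=None):
    """Return a copy of df whose columns gain new_level as the innermost level.

    `name` is a hypothetical extra parameter (not in the question) that
    labels the new level.
    """
    new_level = list(new_level)
    if len(new_level) != len(df.columns):
        raise ValueError("new_level must have one entry per column")
    cols = df.columns
    # One full-length label array per existing level (just one for a flat Index).
    arrays = [cols.get_level_values(i) for i in range(cols.nlevels)]
    new_df = df.copy()
    new_df.columns = pd.MultiIndex.from_arrays(
        [*arrays, new_level], names=[*cols.names, name]
    )
    return new_df


df = pd.DataFrame(0, columns=["foo", "bar"], index=range(2))
two = append_level(df, ["C", "D"])
three = append_level(two, ["E", "F"], name="extra")
print(two.columns.tolist())        # [('foo', 'C'), ('bar', 'D')]
print(three.columns.tolist())      # [('foo', 'C', 'E'), ('bar', 'D', 'F')]
print(list(three.columns.names))   # [None, None, 'extra']
```

The original `df` is left untouched because the function works on a copy.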
If the columns are not a MultiIndex, you can just do:
df.columns = pd.MultiIndex.from_arrays([df.columns.tolist(), ['C','D']])
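If they are already a MultiIndex:
if isinstance(df.columns, pd.MultiIndex):
    old_arrays = [df.columns.get_level_values(i) for i in range(df.columns.nlevels)]
    df.columns = pd.MultiIndex.from_arrays([*old_arrays, ['E', 'F']])
Note that pd.MultiIndex.levels gives a FrozenList of the unique values of each level, not one label per column. Unpacking it into from_arrays only works when every level happens to hold distinct labels in the same order as the columns, as in the example from the question. get_level_values returns the full label array for each level, which is the list of arrays that from_arrays expects.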
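One thing worth checking before relying on `.levels`: it holds each level's unique values rather than one label per column. A small sketch on made-up columns (the labels are purely illustrative), where a label repeats and the order is not sorted, shows how it differs from `get_level_values`:

```python
import pandas as pd

# Columns where the first level repeats a label and is not in sorted order.
cols = pd.MultiIndex.from_arrays([["B", "B", "A"], ["x", "y", "z"]])

# Unique values per level.
print([list(level) for level in cols.levels])
# [['A', 'B'], ['x', 'y', 'z']]

# One label per column, in column order.
print([list(cols.get_level_values(i)) for i in range(cols.nlevels)])
# [['B', 'B', 'A'], ['x', 'y', 'z']]

# Only the per-column arrays line up with a new three-entry level.
full = [cols.get_level_values(i) for i in range(cols.nlevels)]
new_cols = pd.MultiIndex.from_arrays([*full, ["E", "F", "G"]])
print(new_cols.tolist())
# [('B', 'x', 'E'), ('B', 'y', 'F'), ('A', 'z', 'G')]
```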