After groupby, how to flatten column headers?

Question

I’m trying to left join multiple pandas dataframes on a single Id column, but when I attempt the merge I get an error:

KeyError: 'Id'

I think it might be because my dataframes have offset columns resulting from a groupby statement, but I could very well be wrong. Either way I can’t figure out how to “unstack” my dataframe column headers. None of the answers at this question seem to work.

My groupby code:

step1 = pd.DataFrame(step3.groupby(['Id', 'interestingtabsplittest2__grp'])['applications'].sum())
step1.sort_values('applications', ascending=False).head(3)

Returns:

How do I get those offset headers into the top level?

Asked By: samthebrand


Answer 1

You’re looking for .reset_index().

In [11]: df = pd.DataFrame([[2, 3], [5, 6]], pd.Index([1, 4], name="A"), columns=["B", "C"])

In [12]: df
Out[12]:
   B  C
A
1  2  3
4  5  6

In [13]: df.reset_index()
Out[13]:
   A  B  C
0  1  2  3
1  4  5  6

Note that you can avoid this step by passing as_index=False when doing the groupby:

step1 = step3.groupby(['Id', 'interestingtabsplittest2__grp'], as_index=False)['applications'].sum()
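To tie this back to the merge in the question, here is a minimal sketch using made-up data. The contents of step3, the second frame other, and its label column are assumptions for illustration only. It shows that after a plain groupby the Id key sits in the index rather than in the columns, and that both reset_index() and as_index=False give a frame where Id is an ordinary column that can be used in a left join.

```python
import pandas as pd

# Hypothetical stand-in for the asker's step3 frame; the values are invented.
step3 = pd.DataFrame({
    "Id": [1, 1, 2, 2],
    "interestingtabsplittest2__grp": ["a", "a", "b", "b"],
    "applications": [10, 5, 7, 3],
})

keys = ["Id", "interestingtabsplittest2__grp"]

# Plain groupby: the group keys end up in the row MultiIndex, not in the columns.
grouped = step3.groupby(keys)[["applications"]].sum()
id_in_columns_before = "Id" in grouped.columns

# Option 1: move the index levels back into ordinary columns.
flat_reset = grouped.reset_index()

# Option 2: never put the keys into the index in the first place.
flat_as_index = step3.groupby(keys, as_index=False)["applications"].sum()

# Hypothetical second frame to left-join against on the Id column.
other = pd.DataFrame({"Id": [1, 2, 3], "label": ["x", "y", "z"]})
merged = other.merge(flat_as_index, on="Id", how="left")
```

Either option yields the same three columns (Id, interestingtabsplittest2__grp, applications); as_index=False just saves the extra step.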

Answered By: Andy Hayden

Answer 2

The accepted answer isn’t enough if you apply multiple aggregations with .agg(), because the result then has multi-level (MultiIndex) column headers, and reset_index() alone leaves those in place.

You can instead drop the topmost level(s) and then reset the index.

df.droplevel(axis=1, level=0).reset_index()

Here I dropped only one level, but you can also pass a list of levels (at least one column level must remain afterwards):

df.droplevel(axis=1, level=[0,1]).reset_index()
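As a rough illustration of the multi-aggregation case, the sketch below builds a small invented frame (the data and variable names are assumptions, not from the question), aggregates one column two ways, and then flattens the two-level headers with droplevel. It also shows a common alternative, joining the levels into a single name, which avoids duplicate column names when several columns share the same aggregation functions.

```python
import pandas as pd

# Hypothetical data, invented for illustration.
df = pd.DataFrame({
    "Id": [1, 1, 2],
    "applications": [10, 5, 7],
})

# Several aggregations on one column give two-level column headers:
# ("applications", "sum") and ("applications", "mean").
agg = df.groupby("Id").agg({"applications": ["sum", "mean"]})

# Drop the top level ("applications") and move Id out of the index.
flat = agg.droplevel(level=0, axis=1).reset_index()

# Alternative: keep both levels by joining them into a single name.
joined = agg.copy()
joined.columns = ["_".join(col) for col in joined.columns]
joined = joined.reset_index()
```

With this data, flat has the columns Id, sum and mean, while joined has Id, applications_sum and applications_mean.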

Answered By: Shayan RC
