After groupby, how to flatten column headers?

Question:

I’m trying to left join multiple pandas dataframes on a single Id column, but when I attempt the merge I get warning:

KeyError: ‘Id’.

I think it might be because my dataframes have offset columns resulting from a groupby statement, but I could very well be wrong. Either way I can’t figure out how to “unstack” my dataframe column headers. None of the answers at this question seem to work.

My groupby code:

step1 = pd.DataFrame(step3.groupby(['Id', 'interestingtabsplittest2__grp'])['applications'].sum())
step1.sort('applications', ascending=False).head(3)

Returns:

offset headers

How to get those offset headers into the top level?

Asked By: samthebrand

||

Answers:

You’re looking for .reset_index().

In [11]: df = pd.DataFrame([[2, 3], [5, 6]], pd.Index([1, 4], name="A"), columns=["B", "C"])

In [12]: df
Out[12]:
   B  C
A
1  2  3
4  5  6

In [13]: df.reset_index()
Out[13]:
   A  B  C
0  1  2  3
1  4  5  6

Note: That you can avoid this step by using as_index=False when doing the groupby.

step1 = step3.groupby(['Id', 'interestingtabsplittest2__grp'], as_index=False)['applications'].sum()
Answered By: Andy Hayden

The accepted answer doesn’t work if you do multiple aggregation with .agg() or if you’re grouping by multiple columns

You can instead drop the topmost level(s) and then reset the index.

df.droplevel(axis=1, level=0).reset_index()

Here, I have dropped only one level but you can pass an array instead as well:

df.droplevel(axis=1, level=[0,1]).reset_index()
Answered By: Shayan RC
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.