Creating new columns from pandas.DataFrame.groupby(['arg1', 'arg2']) with mean values

Question:

I have data similar to the following in a pandas.DataFrame:

import pandas as pd

df = pd.DataFrame({
    'Year': [2001, 2001, 2001, 2001, 2002, 2002, 2002, 2002],
    'Month': ['Aug', 'Aug', 'Sep', 'Sep', 'Aug', 'Aug', 'Sep', 'Sep'],
    'Day': [1, 2, 1, 2, 1, 2, 1, 2],
    'Value': [1, 2, 3, 4, 5, 6, 7, 8],
})

Now I group by 'Month' and 'Year' and calculate the mean of 'Value':

print(df.groupby(['Month', 'Year'])['Value'].mean())

The output looks like:

Month  Year
Aug    2001    1.5
       2002    5.5
Sep    2001    3.5
       2002    7.5

Now I want to create a new DataFrame that looks like this:

Year  Aug  Sep
2001  1.5  3.5
2002  5.5  7.5

Are there any functions in the pandas module that could help me with this? Thanks in advance!

Asked By: Niklas


Answers:

You can do this with pivot_table:

table = pd.pivot_table(df, values='Value', index=['Year'],
                       columns=['Month'], aggfunc='mean')
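
For anyone who wants to run this end to end, here is a minimal self-contained sketch that rebuilds the DataFrame from the question and pivots it. It passes the aggregation as the string "mean" (which is also the default for pivot_table), so NumPy does not need to be imported; the variable name table is just the one used above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Year": [2001, 2001, 2001, 2001, 2002, 2002, 2002, 2002],
    "Month": ["Aug", "Aug", "Sep", "Sep", "Aug", "Aug", "Sep", "Sep"],
    "Day": [1, 2, 1, 2, 1, 2, 1, 2],
    "Value": [1, 2, 3, 4, 5, 6, 7, 8],
})

# Rows come from 'Year', columns from 'Month';
# each cell holds the mean of 'Value' for that (Year, Month) pair.
table = pd.pivot_table(
    df, values="Value", index="Year", columns="Month", aggfunc="mean"
)
print(table)
```

The result is indexed by Year, so a single cell can be read with table.loc[2001, "Aug"].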

Regards,
Jehona.

Answered By: Jehona Kryeziu

The OP is already close to the desired result. The groupby followed by mean returns a Series with a (Year, Month) MultiIndex, so all that remains is to call pandas.Series.unstack, which moves the innermost index level (Month) into the columns:

df_new = df.groupby(['Year', 'Month'])['Value'].mean().unstack()

[Out]:

Month  Aug  Sep
Year           
2001   1.5  3.5
2002   5.5  7.5
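
In the output above, Year is the index and the column axis carries the label Month. If the goal is the exact layout from the question, with Year as an ordinary column, one possible follow-up (a sketch, not the only way) is to chain reset_index and rename_axis onto the same expression:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Year": [2001, 2001, 2001, 2001, 2002, 2002, 2002, 2002],
    "Month": ["Aug", "Aug", "Sep", "Sep", "Aug", "Aug", "Sep", "Sep"],
    "Day": [1, 2, 1, 2, 1, 2, 1, 2],
    "Value": [1, 2, 3, 4, 5, 6, 7, 8],
})

df_new = (
    df.groupby(["Year", "Month"])["Value"]
      .mean()
      .unstack()                           # 'Month' values become columns
      .reset_index()                       # 'Year' becomes an ordinary column
      .rename_axis(None, axis="columns")   # drop the 'Month' label on the column axis
)
print(df_new)
```

After this, df_new has the columns Year, Aug and Sep and a default integer index.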


Answered By: Gonçalo Peres