Python – data.to_csv output format

Question:

From a csv file having the following format:

Date,Data
01-01-01,111
02-02-02,222
03-03-03,333

I am calculating the monthly average of the values using the following code:

data = pd.read_csv("input.csv")
data['Month'] = pd.DatetimeIndex(data.reset_index()['Date']).month
mean_data = data.groupby('Month').mean()

Then I output a csv file using the following command:

mean_data.to_csv("test.csv")

It works fine and give me the following output:

Month,Data
01,01
02,02
03,03
04,04
...

But now I would like to know how many data have been included inside the monthly average calculation. For that I changed:

mean_data = data.groupby('Month').mean()

by:

mean_data = data.groupby(['Month']).agg(['mean', 'count'])

But the problem comes now. When I want to output the csv , I now have a weird format as follow:

      Data,Data,
      mean,count,
Month,
01,     01,8,
02,     02,9,
03,     03,7,
04,     04,5,

Which is not really convenient. Instead I would like to have the following output:

Month,Mean,Count
01,01,8
02,02,9
03,03,7
04,04,5

Does anyone know how to achieve that?

Asked By: steve

||

Answers:

Need specify column after groupby:

#convert first column to datetime
data = pd.read_csv("input.csv", parse_dates=[0])

df['Month'] = df['Date'].dt.month
mean_data = data.groupby('Month')['Data'].agg(['mean', 'count'])

should be simplify:

mean_data = data.groupby(df['Date'].dt.month)['Data'].agg(['mean', 'count'])
Answered By: jezrael