# Python Pandas: Calculate moving average within group

## Question:

I have a dataframe containing time series for 100 objects:

``````
object  period  value
1       1       24
1       2       67
...
1       1000    56
2       1       59
2       2       46
...
2       1000    64
3       1       54
...
100     1       451
100     2       153
...
100     1000    21
``````

I want to calculate a moving average with a window of 10 for the `value` column. I guess I have to do something like

``````
df.groupby('object').apply(lambda ~calculate MA~)
``````

and then merge this Series back into the original dataframe by object? I can’t figure out the exact commands.

## Answers:

You can use `rolling` with `transform`:

``````
df['moving'] = df.groupby('object')['value'].transform(lambda x: x.rolling(10, 1).mean())
``````

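The `1` passed to `rolling` is `min_periods`, the minimum number of observations a window needs to produce a value. With `min_periods=1`, the first rows of each group get the mean of the values available so far instead of `NaN`.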
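
To make the effect of `min_periods` concrete, here is a minimal sketch on hypothetical toy data (two objects, four periods each, and a window of 3 instead of 10 so the result is easy to check by hand). The column names `ma_partial` and `ma_strict` are made up for illustration.

```python
import pandas as pd

# Hypothetical toy data: two objects with four periods each.
df = pd.DataFrame({
    "object": [1, 1, 1, 1, 2, 2, 2, 2],
    "period": [1, 2, 3, 4, 1, 2, 3, 4],
    "value":  [10, 20, 30, 40, 1, 2, 3, 4],
})

grouped = df.groupby("object")["value"]

# min_periods=1: average whatever is available so far in the group.
df["ma_partial"] = grouped.transform(
    lambda s: s.rolling(3, min_periods=1).mean()
)

# Default min_periods equals the window size, so the first two rows
# of every group are NaN.
df["ma_strict"] = grouped.transform(lambda s: s.rolling(3).mean())

print(df)
```

Because `transform` returns a result with the same index as the input, the assignment lines up row by row and the window never spills from one object into the next.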

You can also call `rolling` on the `groupby` object directly:

``````
df['moving'] = df.groupby('object').rolling(10)['value'].mean()
``````

Recent pandas versions raise an error when this result is assigned directly to a column, so use:

``````
df['moving'] = df.groupby('object').rolling(10)['value'].mean().reset_index(drop=True)
``````

`reset_index` is needed because `groupby(...).rolling(...)` returns a result with a `MultiIndex` (the group key plus the original row label). Assigning that result directly to a column fails with `TypeError: incompatible index of inserted column with frame index`.
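
The index shape described above can be inspected directly. The following is a minimal sketch on hypothetical toy data with a window of 2. As an alternative to `reset_index(drop=True)`, it drops only the group-key level with `droplevel`, which keeps the original row labels so that the assignment aligns on the index rather than on row position.

```python
import pandas as pd

# Hypothetical toy data: two objects with three periods each.
df = pd.DataFrame({
    "object": [1, 1, 1, 2, 2, 2],
    "period": [1, 2, 3, 1, 2, 3],
    "value":  [10, 20, 30, 1, 2, 3],
})

rolled = df.groupby("object")["value"].rolling(2, min_periods=1).mean()

# Two index levels: the group key and the original row label.
print(rolled.index.nlevels)      # 2
print(list(rolled.index.names))  # ['object', None]

# Drop the group-key level; what remains is the original index of df.
df["moving"] = rolled.droplevel("object")
print(df)
```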

Create the column as part of a method chain:

``````
(
    df
    .assign(
        column_name=lambda d: (
            d
            .groupby(['object'])['value']
            .transform(lambda s: s.rolling(10).mean())
        )
    )
)
``````

The answers provided may not produce the desired results if you are grouping on multiple columns.

The following should cut it:

``````
df['moving'] = df.groupby(['col_1', 'col_2', 'col_3']).rolling(10)['value'].mean().droplevel(level=[0,1,2])
``````
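
As a small illustration of the multi-column case, here is a sketch with two hypothetical key columns (`col_1`, `col_2`) and a window of 2. It drops the key levels by name rather than by position, which is an assumption of convenience; positional levels as in the snippet above should behave the same way.

```python
import pandas as pd

# Hypothetical toy data with two grouping columns.
df = pd.DataFrame({
    "col_1": ["a", "a", "a", "a", "b", "b"],
    "col_2": ["x", "x", "y", "y", "x", "x"],
    "value": [1, 3, 10, 30, 5, 7],
})

keys = ["col_1", "col_2"]
rolled = df.groupby(keys)["value"].rolling(2, min_periods=1).mean()

# One index level per grouping column, plus the original row label.
print(rolled.index.nlevels)  # 3

# Drop every grouping level so only the original row labels remain.
df["moving"] = rolled.droplevel(keys)
print(df)
```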

Several of these solutions assume the dataframe is sorted by object and period. In particular, the `reset_index(drop=True)` version assigns by row position, so if the data were organized in panels (sorted by period, then object), the assigned values would not line up with their rows. One general solution, irrespective of sorting order, is the following:

``````
df.loc[:, 'value_sma_10'] = df.groupby(by='object')[['period', 'value']].rolling(window=10, min_periods=1, on='period').mean().reset_index(level='object')['value']
``````
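
The sorting caveat can be checked on a small example. This is a sketch under stated assumptions: hypothetical toy data in panel order (period first, then object), a window of 2, and an explicit `sort_values` followed by `droplevel` as a swapped-in way to get an index-aligned result. The names `by_position`, `by_label`, and `value_sma_2` are made up for illustration.

```python
import pandas as pd

# Panel layout: rows ordered by period first, then object.
df = pd.DataFrame({
    "period": [1, 1, 2, 2, 3, 3],
    "object": [1, 2, 1, 2, 1, 2],
    "value":  [10, 1, 20, 2, 30, 3],
})

# Positional assignment: values come back in group order, which does not
# match the row order of df here.
by_position = (
    df.groupby("object")["value"]
      .rolling(2, min_periods=1)
      .mean()
      .reset_index(drop=True)
)

# Label-based assignment: sort within a temporary copy, roll per object,
# then keep the original row labels so pandas can align the result.
by_label = (
    df.sort_values(["object", "period"])
      .groupby("object")["value"]
      .rolling(2, min_periods=1)
      .mean()
      .droplevel("object")
)

df["value_sma_2"] = by_label
print(df)
```

In this layout `by_position` holds the values for object 1 followed by those for object 2, while `value_sma_2` interleaves them to match the rows of `df`.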