Pandas rolling values

Question:

How do I obtain the rolling windows of length n from a pandas Series of values?

For example, if I have the following:

df = pd.DataFrame({'temperature': [0, 1, 2, np.nan, 4, 2, 0.8, 4, 8.8, 7.12]})

how do I obtain the moving values of length n, i.e. something like, if n=3:

[NaN, NaN, 0], [NaN, 0, 1],…, [4, 8.8, 7.12]

EDIT:
If I use pandas rolling, as:

roll = pd.Series.rolling(df, 3).mean()

then roll is the moving averages of the series.
Here, I do not want the averages of every moving set of 3 values, but these sets of 3 values.

Asked By: user5805065


Answers:

I think you first need to prepend NaNs and then use this solution:

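import numpy as np

N = 3
# Prepend N-1 NaNs so the first windows are padded, as in the question
x = np.concatenate([[np.nan] * (N-1), df['temperature'].values])

def rolling_window(a, window):
    # Strided view with one row per window position; do not write to it
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)

print(rolling_window(x, N))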
[[  nan   nan  0.  ]
 [  nan  0.    1.  ]
 [ 0.    1.    2.  ]
 [ 1.    2.     nan]
 [ 2.     nan  4.  ]
 [  nan  4.    2.  ]
 [ 4.    2.    0.8 ]
 [ 2.    0.8   4.  ]
 [ 0.8   4.    8.8 ]
 [ 4.    8.8   7.12]]
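
The same padding idea can also be sketched with NumPy's sliding_window_view, assuming NumPy 1.20 or later is installed. It returns a read-only view, so it avoids computing strides by hand. The names values, padded and windows are illustrative only.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

df = pd.DataFrame({'temperature': [0, 1, 2, np.nan, 4, 2, 0.8, 4, 8.8, 7.12]})

N = 3
values = df['temperature'].to_numpy(dtype=float)

# Prepend N-1 NaNs so that every original row gets a full-length window
padded = np.concatenate([np.full(N - 1, np.nan), values])

# One row per window position; the result is a read-only view of padded
windows = sliding_window_view(padded, N)
print(windows)
```

Each row of windows lines up with the row of df at the same position, so the array can be attached back to the frame if needed.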
Answered By: jezrael

Even though the thread is old, maybe this will help someone else. I’m a beginner, but I solved user5805065’s question with the following procedure. Maybe someone can make it more elegant and efficient.

  • convert the pandas Series to a NumPy array:
rollTemperature = df['temperature'].values
  • then fill a NumPy array in a for loop, after setting some initial variables:
period = 2
stop = len(rollTemperature)
diffRoll = np.zeros(stop)

for i in range(0,stop):

    if i == 0:
        diffRoll[i] = np.nan

    elif np.mod(i,period)!=0:
        diffRoll[i] = np.nan

    else:
        diffRoll[i] = (rollTemperature[i] + rollTemperature[i-period])/2
  • then add the NumPy array to the existing DataFrame:
df['diffRoll'] = diffRoll

The output is then:

   temperature  diffRoll
0         0.00       NaN
1         1.00       NaN
2         2.00       1.0
3          NaN       NaN
4         4.00       3.0
5         2.00       NaN
6         0.80       2.4
7         4.00       NaN
8         8.80       4.8
9         7.12       NaN
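
The loop above can probably be vectorized. A minimal sketch, assuming the intent is the mean of each value and the value period rows earlier, kept only at positions that are multiples of period; the names pair_mean, on_grid and diff_roll are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'temperature': [0, 1, 2, np.nan, 4, 2, 0.8, 4, 8.8, 7.12]})

period = 2
s = df['temperature']

# Mean of each value and the value `period` rows earlier (NaN where no earlier row exists)
pair_mean = (s + s.shift(period)) / 2

# Keep only positions that are multiples of `period`; everything else becomes NaN
on_grid = np.arange(len(s)) % period == 0
diff_roll = pair_mean.where(on_grid)

df['diffRoll'] = diff_roll
print(df)
```

Position 0 stays NaN without a special case, because shift leaves no earlier value to pair it with.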
Answered By: grove.tomas

df1 = df  # assumption: df1 is the question's DataFrame
(pd.concat([df1.shift(i) for i in range(3)], axis=1)
   .loc[:, ::-1]
   .agg(list, axis=1))

0     [nan, nan, 0.0]
1     [nan, 0.0, 1.0]
2     [0.0, 1.0, 2.0]
3     [1.0, 2.0, nan]
4     [2.0, nan, 4.0]
5     [nan, 4.0, 2.0]
6     [4.0, 2.0, 0.8]
7     [2.0, 0.8, 4.0]
8     [0.8, 4.0, 8.8]
9    [4.0, 8.8, 7.12]
dtype: object
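
The shift approach can be generalized to any window length. A hedged sketch, where rolling_lists is a hypothetical helper name (not a pandas function) and the input is assumed to be a single Series:

```python
import numpy as np
import pandas as pd

def rolling_lists(s, n):
    """Return a Series whose i-th entry is the list of the n values ending at row i."""
    # Shift by n-1, ..., 1, 0 so each row reads oldest to newest, left to right
    shifted = [s.shift(i) for i in range(n - 1, -1, -1)]
    # keys gives the columns distinct labels instead of n copies of the same name
    frame = pd.concat(shifted, axis=1, keys=range(n))
    return pd.Series(frame.to_numpy().tolist(), index=s.index)

df = pd.DataFrame({'temperature': [0, 1, 2, np.nan, 4, 2, 0.8, 4, 8.8, 7.12]})
result = rolling_lists(df['temperature'], 3)
print(result)
```

Rows with fewer than n earlier values are padded with NaN on the left, which matches the layout asked for in the question.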
Answered By: G.G