Calculation of rolling speed in a pandas DataFrame

Question

I have the following challenge: I have a pandas DataFrame in which each row holds an ArucoID, a frameID and the associated coordinates in a coordinate system. For example:

# import pandas library
import pandas as pd
# lst_of_dfs = []
# dictionary with list object of values
data1 = {
     'frameID' : [1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
     'xPos' : [10.0, 10.5, 11.0, 12.0, 13, 4.0, 5.0, 6.0, 7.0, 9.0, 1.5, 2.0, 2.5, 3.0, 4.0 ],
     'yPos' : [-0.2, -0.1, -0.1, 0.0, 0.0, 0.2, 0.2, -0.1, 0.0, 0.05, -0.2, -0.1, 0.0, 0.1, 0.05],
     'ArucoID' : [910, 910, 910, 910, 910, 898, 898, 898, 898, 898, 912, 912, 912, 912, 912],
     'Subtrial' : ['01', '01', '01', '01', '01', '01', '01', '01', '01', '01', '01', '01', '01', '01', '01']
     }
df1 = pd.DataFrame(data1)

   
data2 = {
     'frameID' : [1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
     'xPos' : [9.4, 9.5, 9.0, 9.0, 10, 3.0, 4.0, 5.0, 6.0, 7.0, 2.5, 3.0, 3.5, 3.5, 5.0 ],
     'yPos' : [-0.2, -0.1, -0.1, 0.0, 0.0, 0.2, 0.2, -0.1, 0.0, 0.05, -0.2, -0.1, 0.0, 0.1, 0.05],
     'ArucoID' : [910, 910, 910, 910, 910, 898, 898, 898, 898, 898, 912, 912, 912, 912, 912],
     'Subtrial' : ['02', '02', '02', '02', '02', '02', '02', '02', '02', '02', '02', '02', '02', '02', '02']
     }
df2 = pd.DataFrame(data2)

 
lst_of_dfs = [df1,df2]
 
# creating a Dataframe object 
df_TrajData = pd.concat(lst_of_dfs)

#print(df_TrajData)

Now I calculate the distance covered along xPos over a window of n frames, with the DataFrame grouped by ArucoID:

#calculation of current distance of each ArucoID as rolling mean over a window of n frames (n is set as 2 frames for testing)

all_data = []    
df_grouped = df_TrajData.groupby('ArucoID')
for key, data in df_grouped:
    #calc distance covered in window     
    dX = data['xPos'] - data['xPos'].shift(2)
    #print(dX)
       
    data['dX'] = dX
    
    all_data.append(data)
    
df = pd.concat(all_data)
#print(df)

And now I get into trouble: I want to calculate the speed v (distance per second). That would be v = dX / ((time[-1] - time[0]) / framerate), where time[-1] is the last frameID of the rolling window, time[0] is the current frameID and framerate is 30 frames per second.

I started with (rolling_window=3, min_periods=1):

df['speed'] = df.groupby('ArucoID')['dX'].transform(lambda x: x.rolling(3, 1).mean())

which is the calculation of the rolling distance. What I would actually like to do is something like this:

df['speed'] = df.groupby('ArucoID')['dX'].transform(lambda s: s.rolling(3, min_periods=1).mean() / ((t[-1] - t[0]) / framerate))

#print(df)

Any suggestions would be appreciated. Many thanks in advance!

Asked By: Paul G.

Answer 1

I’ll assume you want to compute the speed of each device (ArucoID) within each trial (Subtrial).

Preparing dataset

Let’s start with your raw data:

import numpy as np
import pandas as pd

data1 = {
    'ArucoID' : [910, 910, 910, 910, 910, 898, 898, 898, 898, 898, 912, 912, 912, 912, 912],
    'Subtrial' : ['01', '01', '01', '01', '01', '01', '01', '01', '01', '01', '01', '01', '01', '01', '01'],
    'frameID' : [1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
    'xPos' : [10.0, 10.5, 11.0, 12.0, 13, 4.0, 5.0, 6.0, 7.0, 9.0, 1.5, 2.0, 2.5, 3.0, 4.0 ],
    'yPos' : [-0.2, -0.1, -0.1, 0.0, 0.0, 0.2, 0.2, -0.1, 0.0, 0.05, -0.2, -0.1, 0.0, 0.1, 0.05],
}

data2 = {
    'ArucoID' : [910, 910, 910, 910, 910, 898, 898, 898, 898, 898, 912, 912, 912, 912, 912],
    'Subtrial' : ['02', '02', '02', '02', '02', '02', '02', '02', '02', '02', '02', '02', '02', '02', '02'],
    'frameID' : [1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
    'xPos' : [9.4, 9.5, 9.0, 9.0, 10, 3.0, 4.0, 5.0, 6.0, 7.0, 2.5, 3.0, 3.5, 3.5, 5.0 ],
    'yPos' : [-0.2, -0.1, -0.1, 0.0, 0.0, 0.2, 0.2, -0.1, 0.0, 0.05, -0.2, -0.1, 0.0, 0.1, 0.05],
}

df = pd.concat([
    pd.DataFrame(data1),
    pd.DataFrame(data2) 
])

The key is to shift your position records so that each row also carries the next position, which makes the distance between consecutive frames easy to compute.

To do so, we sort the records by device, trial and frame, and then shift them by one frame within each device and trial:

df = df.sort_values(["ArucoID", "Subtrial", "frameID"])
shifted = df.groupby(["ArucoID", "Subtrial"]).shift(-1)
shifted = shifted.drop("frameID", axis=1).rename(columns=lambda x: x + "_")
data = pd.concat([df, shifted], axis=1)

Now your data are properly aligned:

#     ArucoID Subtrial  frameID  xPos  yPos  xPos_  yPos_
# 5       898       01        1   4.0  0.20    5.0   0.20
# 6       898       01        2   5.0  0.20    6.0  -0.10
# 7       898       01        3   6.0 -0.10    7.0   0.00
# 8       898       01        4   7.0  0.00    9.0   0.05
# 9       898       01        5   9.0  0.05    NaN    NaN
# 5       898       02        1   3.0  0.20    4.0   0.20
# ...
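
To see why the shift has to happen per group, here is a minimal sketch on a made-up frame with two devices (the names `toy` and `next_x` are illustrative and do not appear in the question). A negative shift pulls the next row's value up, and grouping keeps the last frame of one device from being paired with the first frame of the next:

```python
import pandas as pd

# Hypothetical toy data: two devices, three frames each
toy = pd.DataFrame({
    "ArucoID": [1, 1, 1, 2, 2, 2],
    "frameID": [1, 2, 3, 1, 2, 3],
    "xPos": [0.0, 1.0, 3.0, 10.0, 10.5, 11.5],
})

# shift(-1) within each group: every row receives the xPos of the next frame
toy["next_x"] = toy.groupby("ArucoID")["xPos"].shift(-1)

# The last frame of each device has no successor, so next_x is NaN there
print(toy)
```

Without the `groupby`, the last row of device 1 would be paired with the first row of device 2 and produce a spurious jump.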

Speed computations

Distance

Then we can compute the euclidean distance easily:

def distance(x):
    return np.sqrt(np.power(x["xPos"] - x["xPos_"], 2) + np.power(x["yPos"] - x["yPos_"], 2))

data["dist"] = data.apply(distance, axis=1)
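
`DataFrame.apply` with `axis=1` calls the function once per row, which can become slow on long recordings. As an alternative sketch (not part of the original answer), the same distance can be computed on whole columns with `np.hypot`. The small frame `pts` below is made up for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical aligned positions: current (xPos, yPos) and next (xPos_, yPos_)
pts = pd.DataFrame({
    "xPos": [0.0, 3.0, 6.0],
    "yPos": [0.0, 4.0, 8.0],
    "xPos_": [3.0, 6.0, np.nan],
    "yPos_": [4.0, 8.0, np.nan],
})

# np.hypot(a, b) computes sqrt(a**2 + b**2) element-wise; NaN propagates
pts["dist"] = np.hypot(pts["xPos_"] - pts["xPos"], pts["yPos_"] - pts["yPos"])
print(pts)
```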

Point estimates for speed

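The point estimate of speed is the distance covered in one frame divided by the frame duration (1/30 s), and the moving average smooths it over a 3-frame window within each device and trial: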

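# Distance per frame divided by the frame duration (1/30 s at 30 frames per second)
data["point_speed"] = data["dist"]/(1/30)
# Rolling mean per device and trial; .values relies on data being sorted by the group keys
data["mov_speed"] = data.groupby(["ArucoID", "Subtrial"])["point_speed"].rolling(3, min_periods=1).mean().values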

#     ArucoID Subtrial  frameID  xPos  yPos  xPos_  yPos_      dist point_speed  mov_speed
# 5       898       01        1   4.0  0.20    5.0   0.20  1.000000   30.000000  30.000000  
# 6       898       01        2   5.0  0.20    6.0  -0.10  1.044031   31.320920  30.660460  
# 7       898       01        3   6.0 -0.10    7.0   0.00  1.004988   30.149627  30.490182  
# 8       898       01        4   7.0  0.00    9.0   0.05  2.000625   60.018747  40.496431  
# 9       898       01        5   9.0  0.05    NaN    NaN       NaN         NaN  45.084187  
# 5       898       02        1   3.0  0.20    4.0   0.20  1.000000   30.000000  30.000000
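
Returning to the formula in the question, v = dX / ((time[-1] - time[0]) / framerate), the following is a minimal sketch of one way to evaluate it over a window of `N` frames. It assumes the window is expressed as a row offset within each device and trial, and it yields the signed velocity along x rather than the Euclidean speed used above. The names `track`, `FRAMERATE`, `N` and `window_speed` are illustrative, and the sample rows are those of ArucoID 898, Subtrial 01:

```python
import pandas as pd

FRAMERATE = 30  # frames per second, as stated in the question
N = 2           # window length in frames (the question uses 2 for testing)

# Sample rows for one device and trial, taken from the question's data
track = pd.DataFrame({
    "ArucoID": [898] * 5,
    "Subtrial": ["01"] * 5,
    "frameID": [1, 2, 3, 4, 5],
    "xPos": [4.0, 5.0, 6.0, 7.0, 9.0],
})

track = track.sort_values(["ArucoID", "Subtrial", "frameID"])
grouped = track.groupby(["ArucoID", "Subtrial"])

# Displacement and elapsed frames between each row and the row N steps earlier
dX = grouped["xPos"].diff(N)
dframes = grouped["frameID"].diff(N)

# v = dX / ((frame_end - frame_start) / framerate)
track["window_speed"] = dX / (dframes / FRAMERATE)
print(track)
```

Dividing by the actual frameID difference instead of a fixed `N / FRAMERATE` keeps the result correct if frames are missing from a recording.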

Average speed

After that we can aggregate by device and trial to get the total distance and the number of frames:

final = data.groupby(["ArucoID", "Subtrial"]).agg({"dist": "sum", "frameID": "count"}).rename(columns={"frameID": "count"})

#                       dist  count
# ArucoID Subtrial                 
# 898     01        5.049643      5
#         02        4.050267      5
# 910     01        3.014890      5
#         02        1.741421      5
# 912     01        2.530955      5
#         02        2.620637      5

We can also compute average mechanical speed of each device and trial:

def speed(x, frame_time=1.):
    return x["dist"]/((x["count"] - 1)*frame_time)

final["speed"] = final.apply(speed, axis=1, frame_time=1/30)

#                       dist  count      speed
# ArucoID Subtrial                            
# 898     01        5.049643      5  37.872323
#         02        4.050267      5  30.377006
# 910     01        3.014890      5  22.611671
#         02        1.741421      5  13.060660
# 912     01        2.530955      5  18.982163
#         02        2.620637      5  19.654778

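Then merge the average speed back into the per-frame records, so that every row carries the average speed of its device and trial:

final = data.merge(final["speed"].rename("avg_speed"), left_on=["ArucoID", "Subtrial"], right_index=True)
final["speed_ratio"] = final["mov_speed"]/final["avg_speed"]
final["speed_excess"] = 1. - final["speed_ratio"]

Here speed_ratio is the moving-average speed relative to the average speed of the group, and speed_excess is one minus that ratio.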

Post processing

Finally we can pivot those records to easily navigate and render them:

cross = final.pivot_table(index="frameID", columns=["ArucoID", "Subtrial"], values=["point_speed", "mov_speed", "avg_speed", "speed_ratio", "speed_excess"])
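
For readers unfamiliar with `pivot_table`, this is a minimal sketch of the resulting layout on made-up records (the names `records`, `wide` and `speed_898` and the numbers are illustrative only). Frames become rows, and every combination of value, device and trial becomes a column:

```python
import pandas as pd

# Hypothetical long-format records: one row per (device, trial, frame)
records = pd.DataFrame({
    "ArucoID": [898, 898, 910, 910],
    "Subtrial": ["01", "01", "01", "01"],
    "frameID": [1, 2, 1, 2],
    "mov_speed": [30.0, 31.0, 15.0, 16.0],
})

# Frames as rows; (value name, ArucoID, Subtrial) as a three-level column index
wide = records.pivot_table(
    index="frameID", columns=["ArucoID", "Subtrial"], values=["mov_speed"]
)

# A single series is selected with a tuple of column levels
speed_898 = wide[("mov_speed", 898, "01")]
print(wide)
```

Note that `pivot_table` aggregates duplicates with the mean by default; here every (frame, device, trial) combination occurs once, so the values pass through unchanged.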

[Figure: moving-average speeds for the provided dataset]

[Figure: moving-average speeds compared with the average speed]

Answered By: jlandercy