Pandas MultiIndex updating with derived values
Question:
I am trying to update a MultiIndex frame with derived data.
My frame is a time series indexed by two levels, 'Vehicle_ID' and 'Frame_ID'. I iterate over each Vehicle_ID in order, compute exponentially weighted averages to clean the data, and then try to merge the resulting columns back into the original MultiIndex DataFrame.
Example Code:
v_ids = trajec.index.get_level_values('Vehicle_ID').unique().values
for id in v_ids:
    ewm_x = trajec.loc[(id,), 'Local_X'].ewm(span=T_pos/dt).mean()
    ewm_y = trajec.loc[(id,), 'Local_Y'].ewm(span=T_pos_x/dt).mean()
    smooth = pd.DataFrame({'Vehicle_ID': id, 'Frame_ID': ewm_y.index.values, 'ewm_y': ewm_y, 'ewm_x': ewm_x}).set_index(['Vehicle_ID', 'Frame_ID'])
    trajec.join(smooth)
This works outside the loop: the values are joined to the trajec dataframe. Inside the loop, however, the result seems to be overwritten on each iteration.
                      Local_X   Local_Y  v_Length  v_Width  v_Class  v_Vel  v_Acc  Lane_ID  Preceding  Following  Space_Headway  Time_Headway
Vehicle_ID  Frame_ID
1           12         16.884    48.213      14.3      6.4        2  12.50    0.0        2          0          0           0.00          0.00
            13         16.938    49.463      14.3      6.4        2  12.50    0.0        2          0          0           0.00          0.00
            14         16.991    50.712      14.3      6.4        2  12.50    0.0        2          0          0           0.00          0.00
            15         17.045    51.963      14.3      6.4        2  12.50    0.0        2          0          0           0.00          0.00
            16         17.098    53.213      14.3      6.4        2  12.50    0.0        2          0          0           0.00          0.00
...         ...           ...       ...       ...      ...      ...    ...    ...      ...        ...        ...            ...           ...
2911        8588       53.693  1520.312      14.9      5.9        2  31.26    0.0        5       2910       2915          78.19          2.50
            8589       53.719  1523.437      14.9      5.9        2  31.26    0.0        5       2910       2915          78.26          2.50
            8590       53.746  1526.564      14.9      5.9        2  31.26    0.0        5       2910       2915          78.41          2.51
            8591       53.772  1529.689      14.9      5.9        2  31.26    0.0        5       2910       2915          78.61          2.51
            8592       53.799  1532.830      14.9      5.9        2  30.70    5.9        5       2910       2915          78.81          2.57
DataFrame excerpt.
Answers:
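Create an empty DataFrame outside the loop, concatenate each iteration's result onto it, and join the accumulated frame to the original once the loop has finished. Note that join returns a new DataFrame rather than modifying trajec in place, so its result has to be assigned.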
v_ids = trajec.index.get_level_values('Vehicle_ID').unique().values
results = pd.DataFrame()  # empty dataframe to store results
for id in v_ids:
    ewm_x = trajec.loc[(id,), 'Local_X'].ewm(span=T_pos/dt).mean()
    ewm_y = trajec.loc[(id,), 'Local_Y'].ewm(span=T_pos_x/dt).mean()
    smooth = pd.DataFrame({'Vehicle_ID': id, 'Frame_ID': ewm_y.index.values, 'ewm_y': ewm_y, 'ewm_x': ewm_x}).set_index(['Vehicle_ID', 'Frame_ID'])
    results = pd.concat([results, smooth])  # concatenate results from each iteration
# join the results to the original dataframe
trajec = trajec.join(results)
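As a possible alternative, the loop and the join can probably be avoided altogether by grouping on the 'Vehicle_ID' level and letting transform hand back a Series aligned to the original index. This is only a minimal sketch on made-up data: the toy frame, the helper name ewm_per_vehicle, and span = 3 (standing in for T_pos/dt) are assumptions for illustration, not values from the question.

```python
import pandas as pd

# Toy stand-in for `trajec` (values are made up for illustration).
index = pd.MultiIndex.from_product(
    [[1, 2], [10, 11, 12]], names=["Vehicle_ID", "Frame_ID"]
)
trajec = pd.DataFrame(
    {
        "Local_X": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
        "Local_Y": [5.0, 6.0, 7.0, 50.0, 60.0, 70.0],
    },
    index=index,
)

span = 3  # placeholder for T_pos / dt in the question


def ewm_per_vehicle(frame, column, span):
    """Smooth `column` separately per Vehicle_ID, keeping the original index."""
    return frame.groupby(level="Vehicle_ID")[column].transform(
        lambda s: s.ewm(span=span).mean()
    )


# assign returns a new frame with the smoothed columns added.
smoothed = trajec.assign(
    ewm_x=ewm_per_vehicle(trajec, "Local_X", span),
    ewm_y=ewm_per_vehicle(trajec, "Local_Y", span),
)
print(smoothed)
```

Because the smoothing runs inside each group, the average restarts at the first frame of every vehicle, which should match what the per-vehicle loop computes.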