Python pandas Can I calculate the delta of column values by the ID of another column in minute increments?
Question:
I have a Csv file that looks something like this
Time
Count
Operation
10:01:00
2
Up
10:01:00
5
Down
10:01:00
1
Down
10:01:00
2
Up
10:01:00
1
Up
10:02:00
3
Down
10:02:00
2
Up
10:02:00
5
Down
I want to add up the values of the count column for each minute by the id of the operations column and then for the same minute subtract the up and down sums from each other which should give me something like this
Sum()
:
Time
Count
Operation
10:01:00
5
Up
10:01:00
6
Down
10:02:00
2
Up
10:02:00
8
Down
Diff()
:
Time
Delta
10:01:00
1
10:02:00
6
To do this, I try something like
def Delta_Volume():
df = pd.read_csv(Ex_Csv, usecols=['Time','Count','Operation'], parse_dates=[0])
df['Time'] = df['Time'].dt.floor("T", 0).dt.time
df1 = df.groupby('Operation').sum('Count')
df2 = df.groupby('Operation').diff('Count')
#df['Delt_of_row'] = df.loc[1 : 3,['Count' , 'Operation']].sum(axis = 1)
#df['Delt_of_row'] = df.loc[1 : 3,['Count' , 'Operation']].diff(axis = 1)
print(df1)
But it doesn’t work the way I need unfortunately
Answers:
You can start with a classical GroupBy.sum
, then use the MultiIndex to compute the difference:
df2 = df.groupby(['Operation', 'Time']).sum()
print(df2)
out = df2.loc['Down']-df2.loc['Up']
print(out)
output:
# groupby.sum
Count
Operation Time
Down 10:01:00 6
10:02:00 8
Up 10:01:00 5
10:02:00 2
# difference
Count
Time
10:01:00 1
10:02:00 6
I have a Csv file that looks something like this
Time | Count | Operation |
---|---|---|
10:01:00 | 2 | Up |
10:01:00 | 5 | Down |
10:01:00 | 1 | Down |
10:01:00 | 2 | Up |
10:01:00 | 1 | Up |
10:02:00 | 3 | Down |
10:02:00 | 2 | Up |
10:02:00 | 5 | Down |
I want to add up the values of the count column for each minute by the id of the operations column and then for the same minute subtract the up and down sums from each other which should give me something like this
Sum()
:
Time | Count | Operation |
---|---|---|
10:01:00 | 5 | Up |
10:01:00 | 6 | Down |
10:02:00 | 2 | Up |
10:02:00 | 8 | Down |
Diff()
:
Time | Delta |
---|---|
10:01:00 | 1 |
10:02:00 | 6 |
To do this, I try something like
def Delta_Volume():
df = pd.read_csv(Ex_Csv, usecols=['Time','Count','Operation'], parse_dates=[0])
df['Time'] = df['Time'].dt.floor("T", 0).dt.time
df1 = df.groupby('Operation').sum('Count')
df2 = df.groupby('Operation').diff('Count')
#df['Delt_of_row'] = df.loc[1 : 3,['Count' , 'Operation']].sum(axis = 1)
#df['Delt_of_row'] = df.loc[1 : 3,['Count' , 'Operation']].diff(axis = 1)
print(df1)
But it doesn’t work the way I need unfortunately
You can start with a classical GroupBy.sum
, then use the MultiIndex to compute the difference:
df2 = df.groupby(['Operation', 'Time']).sum()
print(df2)
out = df2.loc['Down']-df2.loc['Up']
print(out)
output:
# groupby.sum
Count
Operation Time
Down 10:01:00 6
10:02:00 8
Up 10:01:00 5
10:02:00 2
# difference
Count
Time
10:01:00 1
10:02:00 6