How to divide one row by another with same value in a specific column?
Question:
I have a dataframe following this pattern:
Measure
Key
mai/21
jun/21
jul/21
…
mar/22
apr/22
Units
4040
100
102
104
…
200
204
Total Value
4040
10500
10700
11000
…
22000
22200
Units
3230
120
130
140
…
230
240
Total Value
3230
120000
130000
140000
…
345000
360000
I need to get a average value per unit and per key, so for every key I have in the dataframe, I need to get the total value row and divide by units row, so I get the average value per unit of each month, and the results needs to follow the same as the input. Just like that:
Measure
Key
mai/21
jun/21
jul/21
…
mar/22
apr/22
Value per Units
4040
105
104.90
105.77
…
110
108.82
Value per Units
3230
1000
1000
1000
…
1500
1500
How can I do it?
Thanks.
Answers:
You can do:
df2 = df.set_index('Measure').groupby('Key').apply(lambda x: (x.T['Total Value'] / x.T['Units']).drop('Key')).reset_index()
df2['Measure'] = 'Value per units'
df2[df.columns]
For better understanding what lambda does, data for a single key can be extracted and checked separately:
df.set_index('Measure').groupby('Key').get_group(4040)
Besides being more readable, this solution is also about 3x faster than the accepted one.
here is one way to do it, revised after the question was updated
i=0
df2=pd.DataFrame(columns=df.columns)
while (i < len(df)):
df2.loc[len(df2)] = ['Value per unit', df.iloc[i,1]] + df.iloc[i+1,2:].divide(df.iloc[i,2:] ).tolist()
i+=2
df2
Measure Key mai/21 jun/21 jul/21 mar/22 apr/22
0 Value per unit 4040 105.0 104.901961 105.769231 110.0 108.823529
1 Value per unit 3230 1000.0 1000.000000 1000.000000 1500.0 1500.000000
I have a dataframe following this pattern:
Measure | Key | mai/21 | jun/21 | jul/21 | … | mar/22 | apr/22 |
---|---|---|---|---|---|---|---|
Units | 4040 | 100 | 102 | 104 | … | 200 | 204 |
Total Value | 4040 | 10500 | 10700 | 11000 | … | 22000 | 22200 |
Units | 3230 | 120 | 130 | 140 | … | 230 | 240 |
Total Value | 3230 | 120000 | 130000 | 140000 | … | 345000 | 360000 |
I need to get a average value per unit and per key, so for every key I have in the dataframe, I need to get the total value row and divide by units row, so I get the average value per unit of each month, and the results needs to follow the same as the input. Just like that:
Measure | Key | mai/21 | jun/21 | jul/21 | … | mar/22 | apr/22 |
---|---|---|---|---|---|---|---|
Value per Units | 4040 | 105 | 104.90 | 105.77 | … | 110 | 108.82 |
Value per Units | 3230 | 1000 | 1000 | 1000 | … | 1500 | 1500 |
How can I do it?
Thanks.
You can do:
df2 = df.set_index('Measure').groupby('Key').apply(lambda x: (x.T['Total Value'] / x.T['Units']).drop('Key')).reset_index()
df2['Measure'] = 'Value per units'
df2[df.columns]
For better understanding what lambda does, data for a single key can be extracted and checked separately:
df.set_index('Measure').groupby('Key').get_group(4040)
Besides being more readable, this solution is also about 3x faster than the accepted one.
here is one way to do it, revised after the question was updated
i=0
df2=pd.DataFrame(columns=df.columns)
while (i < len(df)):
df2.loc[len(df2)] = ['Value per unit', df.iloc[i,1]] + df.iloc[i+1,2:].divide(df.iloc[i,2:] ).tolist()
i+=2
df2
Measure Key mai/21 jun/21 jul/21 mar/22 apr/22
0 Value per unit 4040 105.0 104.901961 105.769231 110.0 108.823529
1 Value per unit 3230 1000.0 1000.000000 1000.000000 1500.0 1500.000000