Calculating subtotals in pandas pivot_table with MultiIndex
Question:
I have the following raw data, in a dataframe:
    BROKER    VENUE  QUANTITY
0  BrokerA  Venue_1       300
1  BrokerA  Venue_2       400
2  BrokerA  Venue_2      1400
3  BrokerA  Venue_3       800
4  BrokerB  Venue_2       500
5  BrokerB  Venue_3      1100
6  BrokerC  Venue_1      1000
7  BrokerC  Venue_1      1200
8  BrokerC  Venue_2     17000
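To reproduce this, the frame can be rebuilt roughly as follows. This is a minimal sketch: the name df matches the code used in the rest of the post, and the integer dtype for QUANTITY is an assumption.

```python
import pandas as pd

# Raw data from the table above; QUANTITY is assumed to be an integer column.
df = pd.DataFrame({
    'BROKER': ['BrokerA', 'BrokerA', 'BrokerA', 'BrokerA',
               'BrokerB', 'BrokerB',
               'BrokerC', 'BrokerC', 'BrokerC'],
    'VENUE': ['Venue_1', 'Venue_2', 'Venue_2', 'Venue_3',
              'Venue_2', 'Venue_3',
              'Venue_1', 'Venue_1', 'Venue_2'],
    'QUANTITY': [300, 400, 1400, 800, 500, 1100, 1000, 1200, 17000],
})
print(df)
```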
I want to do some summarization of the data to see how much each broker sent to each venue, so I created a pivot_table:
pt = df.pivot_table(index=['BROKER', 'VENUE'], values=['QUANTITY'], aggfunc=np.sum)
Result, as expected:
                 QUANTITY
BROKER  VENUE
BrokerA Venue_1     300.0
        Venue_2    1800.0
        Venue_3     800.0
BrokerB Venue_2     500.0
        Venue_3    1100.0
BrokerC Venue_1    2200.0
        Venue_2   17000.0
I also want to see how much each broker sent overall, and show it in this same table. I can get that information with df.groupby('BROKER').sum(), but how can I add it to my pivot table as a column named, say, BROKER_TOTAL?
Note: This question is similar but seems to target an older version of pandas, and my best attempt at adapting it to my situation didn’t work: Pandas Pivot tables row subtotals
Answers:
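You can give df1 a MultiIndex built with MultiIndex.from_arrays, concat it to pt, and finally call sort_index. Note that this adds each broker's total as an extra row rather than as a column: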
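# Sum only QUANTITY, so newer pandas versions do not also "sum" the VENUE strings
df1 = df.groupby('BROKER')[['QUANTITY']].sum()
# Relabel each broker as '<broker>_total' and pair it with an empty VENUE level
df1.index = pd.MultiIndex.from_arrays([df1.index + '_total', len(df1.index) * ['']])
print(df1)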
                QUANTITY
BrokerA_total       2900
BrokerB_total       1600
BrokerC_total      19200
print(pd.concat([pt, df1]).sort_index())
                       QUANTITY
BROKER        VENUE
BrokerA       Venue_1       300
              Venue_2      1800
              Venue_3       800
BrokerA_total              2900
BrokerB       Venue_2       500
              Venue_3      1100
BrokerB_total              1600
BrokerC       Venue_1      2200
              Venue_2     17000
BrokerC_total             19200
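If the totals are wanted as a column, as the question literally asks, one option that is not part of the answer above is to broadcast each broker's sum back onto the rows of the pivot table with groupby(level=...).transform. A minimal sketch, assuming the data from the question (rebuilt here so the snippet runs on its own) and using aggfunc='sum' in place of np.sum:

```python
import pandas as pd

# Data from the question, rebuilt from (BROKER, VENUE, QUANTITY) rows.
rows = [
    ('BrokerA', 'Venue_1', 300), ('BrokerA', 'Venue_2', 400),
    ('BrokerA', 'Venue_2', 1400), ('BrokerA', 'Venue_3', 800),
    ('BrokerB', 'Venue_2', 500), ('BrokerB', 'Venue_3', 1100),
    ('BrokerC', 'Venue_1', 1000), ('BrokerC', 'Venue_1', 1200),
    ('BrokerC', 'Venue_2', 17000),
]
df = pd.DataFrame(rows, columns=['BROKER', 'VENUE', 'QUANTITY'])

# Same pivot table as in the question.
pt = df.pivot_table(index=['BROKER', 'VENUE'], values=['QUANTITY'], aggfunc='sum')

# Group the pivot table by its BROKER index level and repeat each
# broker's sum on every one of that broker's rows.
pt['BROKER_TOTAL'] = pt.groupby(level='BROKER')['QUANTITY'].transform('sum')
print(pt)
```

Every row of a given broker then carries the same BROKER_TOTAL value, so the table keeps its shape and needs no re-sorting.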