How to aggregate DataFrame to count valus per ID in also with other columns in output Python Pandas?
Question:
I have DataFrame in Python Pandas like below:
ID | COL1 | ... | COL_n
----|------|-----|-------
123 | P | ... | ...
123 | P | ... | ...
123 | M | ... | ...
444 | BM | ... | ...
567 | BM | ... | ...
29 | P | ... | ...
And I need to aggregate above columns so as to have output like below:
-
How many time client bought P -> col: X_P
-
How many time client bought M -> col: X_M
-
How many time client bought in total M + P -> col: X_PM
-
in my real dataset I have many more columns and I need to also have these columns in output
ID
COL1
…
COL_n
X_P
X_M
X_PM
123
P
…
…
2
1
3
123
P
…
…
2
1
3
123
M
…
…
2
1
3
444
BM
…
…
0
0
0
567
BM
…
…
0
0
0
29
P
…
…
1
0
1
How can I do that in Python Pandas ?
Answers:
You could do two steps with pd.crosstab
s = (pd.crosstab(df['ID'],df['COL1'])[['M','P']].
assign(MP=lambda x : x.sum(1)).
reindex(df['ID']).
add_prefix('X_'))
s.index = df.index
df = df.join(s)
Out[178]:
ID COL1 X_M X_P X_MP
0 123 P 1 2 3
1 123 P 1 2 3
2 123 M 1 2 3
3 444 BM 0 0 0
4 567 BM 0 0 0
5 29 P 0 1 1
I have DataFrame in Python Pandas like below:
ID | COL1 | ... | COL_n
----|------|-----|-------
123 | P | ... | ...
123 | P | ... | ...
123 | M | ... | ...
444 | BM | ... | ...
567 | BM | ... | ...
29 | P | ... | ...
And I need to aggregate above columns so as to have output like below:
-
How many time client bought P -> col: X_P
-
How many time client bought M -> col: X_M
-
How many time client bought in total M + P -> col: X_PM
-
in my real dataset I have many more columns and I need to also have these columns in output
ID COL1 … COL_n X_P X_M X_PM 123 P … … 2 1 3 123 P … … 2 1 3 123 M … … 2 1 3 444 BM … … 0 0 0 567 BM … … 0 0 0 29 P … … 1 0 1
How can I do that in Python Pandas ?
You could do two steps with pd.crosstab
s = (pd.crosstab(df['ID'],df['COL1'])[['M','P']].
assign(MP=lambda x : x.sum(1)).
reindex(df['ID']).
add_prefix('X_'))
s.index = df.index
df = df.join(s)
Out[178]:
ID COL1 X_M X_P X_MP
0 123 P 1 2 3
1 123 P 1 2 3
2 123 M 1 2 3
3 444 BM 0 0 0
4 567 BM 0 0 0
5 29 P 0 1 1