Python one pivot table for different type of aggregation
Question:
I have an input df like:
category
name
cost
cost2
Tier1
Tina
10
5
Tier1
Jessy
20
15
Tier1
Tina
30
10
Need output:
category
count distinct name
sum of cost
avg of cost2
Tier1
2
60
10
It is similar to SQL: select category, count(distinct name), sum(cost), avg(cost2) group by category.
I am looking for the equivalent easy way to do it in python, preferably in one statement, if not two. My actual use case has 30 columns so need a way I can simply parse the code.
Answers:
Try:
x = df.groupby("category", as_index=False).agg(
count_distinct_name=("name", "nunique"),
sum_of_cost=("cost", "sum"),
avg_of_cost2=("cost2", "mean"),
)
x.columns = x.columns.str.replace("_", " ")
print(x)
Prints:
category count distinct name sum of cost avg of cost2
0 Tier1 2 60 10.0
I have an input df like:
category | name | cost | cost2 |
---|---|---|---|
Tier1 | Tina | 10 | 5 |
Tier1 | Jessy | 20 | 15 |
Tier1 | Tina | 30 | 10 |
Need output:
category | count distinct name | sum of cost | avg of cost2 |
---|---|---|---|
Tier1 | 2 | 60 | 10 |
It is similar to SQL: select category, count(distinct name), sum(cost), avg(cost2) group by category.
I am looking for the equivalent easy way to do it in python, preferably in one statement, if not two. My actual use case has 30 columns so need a way I can simply parse the code.
Try:
x = df.groupby("category", as_index=False).agg(
count_distinct_name=("name", "nunique"),
sum_of_cost=("cost", "sum"),
avg_of_cost2=("cost2", "mean"),
)
x.columns = x.columns.str.replace("_", " ")
print(x)
Prints:
category count distinct name sum of cost avg of cost2
0 Tier1 2 60 10.0