Pandas groupby agg – how to get counts?

Question:

I am trying to get the sum, mean and count of a metric:

df.groupby(['id', 'pushid']).agg({"sess_length": [ np.sum, np.mean, np.count]})

But I get "module 'numpy' has no attribute 'count'". I have tried different ways of expressing the count function but can't get it to work. How do I get an aggregate record count together with the other metrics?

Asked By: L Xandor

Answers:

I think you mean:

df.groupby(['id', 'pushid']).agg({"sess_length": [ 'sum', 'count','mean']})

As mentioned in the pandas documentation, you can pass string names such as 'sum' and 'count' instead of functions. This is generally the preferred way of doing these aggregations.
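
If flat column names are preferred over the two-level header, the same string names should also work with keyword-style named aggregation (available in reasonably recent pandas versions, 0.25 or later as far as I know). A minimal sketch; the sample data and the output names total, average and n are made up for illustration:

```python
import pandas as pd

# Sample data (assumed for illustration)
df = pd.DataFrame(
    {"id": list("ccdef"), "pushid": list("aabbc"),
     "sess_length": [10, 20, 30, 40, 50]}
)

# Each keyword becomes an output column: name=(source column, aggregation name)
summary = df.groupby(["id", "pushid"]).agg(
    total=("sess_length", "sum"),
    average=("sess_length", "mean"),
    n=("sess_length", "count"),
)
print(summary)
```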

Answered By: lego king

You can use strings instead of the functions, like so:

df = pd.DataFrame(
    {"id": list("ccdef"), "pushid": list("aabbc"), 
     "sess_length": [10, 20, 30, 40, 50]}
)

df.groupby(["id", "pushid"]).agg({"sess_length": ["sum", "mean", "count"]})

Which outputs:

           sess_length
                   sum  mean count
 id pushid
 c  a               30  15.0     2
 d  b               30  30.0     1
 e  b               40  40.0     1
 f  c               50  50.0     1

Answered By: Alex

This might work:

df.groupby(['id', 'pushid']).agg({"sess_length": [ np.sum, np.mean, np.size]})
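
One caveat: size and count are not interchangeable when the column has missing values. Size counts every row in the group, while "count" skips NaN. A small sketch with made-up data, using the string names "count" and "size" rather than np.size:

```python
import numpy as np
import pandas as pd

# Made-up data: group "c" has one missing sess_length value
df = pd.DataFrame(
    {"id": ["c", "c", "d"], "sess_length": [10.0, np.nan, 30.0]}
)

# "count" ignores NaN, "size" counts all rows in the group
counts = df.groupby("id")["sess_length"].agg(["count", "size"])
print(counts)
```

If the metric never contains NaN, the two give the same numbers.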
Answered By: Pat Minot

just use np.size

Not sure why the answer needs to be 30 chars long, when the answer is straightforward

Answered By: JustMe