dataframe: column statistics after grouping

Question:

I know this question has been asked before, but I couldn't get the answers to work for me, so I'm asking again.
I have this dataframe:

import pandas as pd

data = pd.DataFrame({
    'id': [9,9,9,2,2,7,7,7,5,8,8,4,4,3,3,3,
           1,1,1,1,1,6,6,6,6,6,10,11,11,11,11],
    'signal': [1,3,5,7,9,13,17,27,5,1,5,5,11,3,7,11,6,
               8,12,14,18,1,3,5,111,115,57,9,21,45,51],
    'mode': [0,0,0,0,0,0,0,0,1,1,1,1,1,2,2,2,2,2,2,
             2,2,3,3,3,3,3,3,3,3,3,3]
})

data.head()
    id  signal  mode
0   9     1      0
1   9     3      0
2   9     5      0
3   2     7      0
4   2     9      0

I want to get the distribution of signal by mode.
I know I need to do the grouping this way:

data.groupby('signal')['mode'].value_counts()

But then I don’t know how to proceed, to arrive at:

mode    total
0         8
1         5
2         8
3        10
Asked By: Amina Umar

||

Answers:

Simply use value_counts on the mode column (df here is the dataframe the question calls data):

>>> df['mode'].value_counts()
3    10
0     8
2     8
1     5
Name: mode, dtype: int64

# Full output, matching the expected table
>>> (df['mode'].value_counts().rename('total').rename_axis('mode')
               .sort_index().reset_index())
   mode  total
0     0      8
1     1      5
2     2      8
3     3     10
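
An equivalent route that stays closer to the groupby attempt in the question is to group by mode and take the size of each group. This is only a minimal sketch: it rebuilds just the mode column from the question (same 31 values, written compactly), and the name totals is illustrative rather than anything from the original answer.

```python
import pandas as pd

# Same 'mode' values as in the question: 8 zeros, 5 ones, 8 twos, 10 threes
data = pd.DataFrame({'mode': [0] * 8 + [1] * 5 + [2] * 8 + [3] * 10})

# Count the rows in each mode group; groupby sorts the group keys by default,
# so no separate sort_index() step is needed here
totals = data.groupby('mode').size().reset_index(name='total')
print(totals)
```

Grouping by signal, as in the question's attempt, splits the rows by signal value first, which is why it does not lead to one total per mode.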

But you may instead want the number of distinct signal values per mode:

>>> df.groupby('mode', as_index=False)['signal'].nunique()
   mode  signal
0     0       8
1     1       3
2     2       8
3     3      10
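
The row count and the distinct count differ only for mode 1, whose five rows carry the signals 5, 1, 5, 5, 11 and so only three distinct values. As a hedged sketch, both figures can be put side by side with named aggregation; the output column names total and unique_signals are made up for illustration, and the frame is rebuilt from the question's signal and mode columns.

```python
import pandas as pd

# 'signal' and 'mode' columns copied from the question ('mode' written compactly)
data = pd.DataFrame({
    'signal': [1, 3, 5, 7, 9, 13, 17, 27, 5, 1, 5, 5, 11, 3, 7, 11, 6,
               8, 12, 14, 18, 1, 3, 5, 111, 115, 57, 9, 21, 45, 51],
    'mode': [0] * 8 + [1] * 5 + [2] * 8 + [3] * 10,
})

# 'size' counts every row in the group; 'nunique' counts distinct signal values
summary = (data.groupby('mode')['signal']
               .agg(total='size', unique_signals='nunique')
               .reset_index())
print(summary)
```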
Answered By: Corralien