how to count unique values after groupby ID

Question:

I have the following pandas dataframe df

ID  from   to
A    0x    0c
A    0x    0f
A    0f    0n
B    0f    0c
B    0c    0f
C    0k    0j
C    0j    0k
C    0k    0a

First I want to group by id and only keep groups if the number of unique values from from and to combined is less than 3.

so the desired df will be

B    0f    0c
B    0c    0f
C    0k    0j
C    0j    0k
C    0k    0a
Asked By: Olive

||

Answers:

What about using a groupby filter with a lambda function that confirms the number of unique values in the from and to columns is less than or equal to 3? You can use DataFrame.stack() as a hacky solution to put all of the values in a dataframe into a single Series to use Series.nunique() with:

import pandas as pd

# Your dataframe here
df = pd.read_clipboard()

out = df.groupby("ID").filter(lambda x: x[["from", "to"]].stack().nunique() <= 3)

out:

  ID from  to
3  B   0f  0c
4  B   0c  0f
5  C   0k  0j
6  C   0j  0k
7  C   0k  0a
Answered By: Chrysophylaxs
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.