Pandas – Sample all rows of N unique users/ids
Question:
I’m trying to sample 1000 unique users from a dataset. These can be any 1000 users, but I want to extract all rows for each of those users.
Input
User_ID | Ship Date |
---|---|
A454 | 8/2/2019 |
A454 | 9/2/2019 |
G658 | 9/2/2019 |
G658 | 9/2/2019 |
from random import sample
df['User_ID'].sample(n=1000, random_state=1)
I tried the above code, but it only returns 1000 sampled User_ID values, not all rows for 1000 unique users.
Answers:
IIUC, get the unique values, sample them, and slice with isin and boolean indexing:
import random
# Pick 1000 distinct IDs, then keep every row whose User_ID is among them
unique_ids = list(df['User_ID'].unique())
out = df[df['User_ID'].isin(random.sample(unique_ids, 1000))]
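The same idea can also be written with pandas alone, which makes the draw reproducible through `random_state`. This is only a sketch under a few assumptions: the helper name `sample_users`, its parameters, and the extra user `Z001` in the toy data are made up for illustration, and capping `n` at the number of available ids is a design choice rather than something the question asked for.

```python
import pandas as pd

# Toy data modelled on the question's table (user Z001 is invented for illustration).
df = pd.DataFrame({
    "User_ID": ["A454", "A454", "G658", "G658", "Z001"],
    "Ship Date": ["8/2/2019", "9/2/2019", "9/2/2019", "9/2/2019", "10/2/2019"],
})

def sample_users(df, n, id_col="User_ID", seed=None):
    """Return every row belonging to n randomly chosen unique ids."""
    unique_ids = df[id_col].drop_duplicates()
    # Cap n so the call does not raise when fewer than n ids exist.
    n = min(n, len(unique_ids))
    chosen = unique_ids.sample(n=n, random_state=seed)
    # Boolean mask: True for each row whose id was drawn.
    return df[df[id_col].isin(chosen)]

out = sample_users(df, n=2, seed=1)
print(out)
```

On the real data this would be called as `sample_users(df, n=1000, seed=1)`. Without the cap, `Series.sample` raises a `ValueError` when `n` exceeds the number of unique ids, just as `random.sample` does.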