Pandas – Sample all rows of N unique users/ids

Question:

I’m trying to sample 1000 unique users from a dataset. These can be any 1000 users, but I want to extract all rows for each of those 1000 users.

Input

User_ID   Ship Date
A454      8/2/2019
A454      9/2/2019
G658      9/2/2019
G658      9/2/2019

from random import sample
df['User_ID'].sample(n=1000, random_state=1)

I tried the above code, but it only returns 1000 sampled User_ID values (which may include repeats), not all rows for 1000 unique users.

Asked By: shockwave


Answers:

IIUC, get the unique values, sample and slice with isin and boolean indexing:

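import random

# Pick 1000 distinct IDs, then keep every row whose User_ID is among them.
# random.sample raises ValueError if there are fewer than 1000 unique IDs.
out = df[df['User_ID'].isin(random.sample(list(df['User_ID'].unique()), 1000))]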
Answered By: mozway
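
For reference, the same idea can be sketched with pandas alone, which also makes the result reproducible through random_state. This is a minimal sketch and not part of the original answer: the helper name sample_users, its parameters, and the extra B123 row in the toy data are assumptions made for illustration.

```python
import pandas as pd

def sample_users(df, n, id_col="User_ID", random_state=None):
    """Return every row belonging to n randomly chosen unique ids.

    Hypothetical helper, not from the original answer.
    """
    # One entry per user, so each user is equally likely to be picked
    # regardless of how many rows they have.
    unique_ids = df[id_col].drop_duplicates()
    # Series.sample raises ValueError if n exceeds the number of unique ids.
    chosen = unique_ids.sample(n=n, random_state=random_state)
    # Boolean indexing keeps all rows of the chosen users.
    return df[df[id_col].isin(chosen)]

# Toy data modelled on the question's input; the B123 row is made up.
df = pd.DataFrame({
    "User_ID": ["A454", "A454", "G658", "G658", "B123"],
    "Ship Date": ["8/2/2019", "9/2/2019", "9/2/2019", "9/2/2019", "9/3/2019"],
})

out = sample_users(df, n=2, random_state=1)
```

With the real data this would be called as sample_users(df, n=1000, random_state=1). Dropping duplicates before sampling is the key step: sampling df['User_ID'] directly would favour users with many rows and could return the same ID more than once.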