Split a Python dataframe to get a certain number of observations

Question:

Here’s a simplified version of the data frame that I have:

customer_ID value_1 value_2 ....
1            0.5    0.2
1            ...    ...
1
1
2
2
3
3
3
....

Suppose there are 1000 unique customers in the above data frame and I only want a sample of the data frame containing 100 of them. The customer_ID values are random, and I don’t know which customer is the 100th, so I cannot just put customers with customer_ID <= 100 into one data frame. How should I do it?

Thanks!

Asked By: lily_kim


Answers:

  1. You can collect all the unique customer IDs (unique() returns a NumPy array):

unique_ID = df.customer_ID.unique()

  2. Then randomly choose 100 of them into a list. random.sample needs a sequence, so convert the array to a list first:

import random

random_ID = random.sample(list(unique_ID), 100)

  3. Finally, filter your dataframe with that list:

df[df['customer_ID'].isin(random_ID)]
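
The three steps above can also be put together in one small function. This is only a sketch, not the one right way to do it: the function name sample_customers, the seed argument and the toy data are made up for illustration, and it uses pandas' own drop_duplicates() and sample() instead of the random module.

```python
import pandas as pd


def sample_customers(df, n, id_col="customer_ID", seed=None):
    """Return every row belonging to n randomly chosen unique IDs."""
    # Keep one entry per unique ID, then draw n of them without replacement.
    chosen = df[id_col].drop_duplicates().sample(n=n, random_state=seed)
    # Keep all rows whose ID is among the chosen ones.
    return df[df[id_col].isin(chosen)]


# Toy data (made-up values): 5 customers with a varying number of rows each.
df = pd.DataFrame({
    "customer_ID": [7, 7, 7, 3, 3, 9, 12, 12, 5],
    "value_1": [0.5, 0.1, 0.9, 0.3, 0.4, 0.8, 0.2, 0.6, 0.7],
})

subset = sample_customers(df, n=2, seed=0)
print(subset)
```

For the data in the question you would call sample_customers(df, n=100). Passing a seed makes the draw repeatable; leave it as None to get a different sample on each run.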

Hope it helps.

Answered By: Enrique Martin