Using pandas, I'm trying to keep only 100 rows of my DataFrame for each value of the column "neighborhood"

Question

I have a very large dataset that I’m trying to shrink.
My idea is to keep 100 rows per neighborhood.

Here’s an overview of my data:

What is the most efficient way to do so?

Thanks in advance

I’m expecting to end up with something that looks like:

Asked By: Julien8

Answer 1

I think you can use groupby with nth and a slice:

dfx = df.groupby('neighborhood').nth[:100]
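As a small illustration of the line above, here is a sketch on a made-up DataFrame. The data, the `price` column and n = 2 (standing in for 100) are assumptions for the example only. The slice notation on `nth` needs a reasonably recent pandas (1.4 or later, as far as I know), and the shape of the result's index differs between pandas 1.x and 2.x.

```python
import pandas as pd

# Hypothetical toy data: three rows for "A", one row for "B".
df = pd.DataFrame({
    "neighborhood": ["A", "A", "B", "A"],
    "price": [10, 20, 30, 40],
})

n = 2  # stands in for 100 in the real dataset

# Keep at most the first n rows of each neighborhood.
# Groups with fewer than n rows are kept in full.
dfx = df.groupby("neighborhood").nth[:n]
print(dfx)
```

Here "A" keeps its first two rows and "B" keeps its single row.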

Answered By: Clegane

Answer 2

It depends on how you want to select the rows.

To keep the first 100 rows of each neighborhood:

n = 100
out = df.groupby('neighborhood').head(n)

To keep 100 random rows of each neighborhood:

n = 100
out = df.groupby('neighborhood').sample(n=n)
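One thing to watch for: head(n) simply returns fewer rows for a neighborhood that has fewer than n rows, whereas sample(n=n) raises a ValueError in that case (unless you sample with replacement). Below is a minimal sketch of both selections on made-up data, with the random sample capped at the group size. The toy DataFrame, the `value` column, n = 3 and the `random_state` are assumptions for illustration, not part of the original question.

```python
import pandas as pd

# Hypothetical toy data: five rows for "A", two rows for "B".
df = pd.DataFrame({
    "neighborhood": ["A"] * 5 + ["B"] * 2,
    "value": range(7),
})

n = 3  # stands in for 100 in the real dataset

# First n rows of each group; small groups are returned in full.
first_rows = df.groupby("neighborhood").head(n)

# Random rows per group, capped at the group size so that
# groups smaller than n do not raise a ValueError.
random_rows = pd.concat([
    group.sample(n=min(n, len(group)), random_state=0)
    for _, group in df.groupby("neighborhood")
])

print(first_rows)
print(random_rows)
```

If every neighborhood is known to have at least n rows, the plain groupby(...).sample(n=n) call is enough and the cap is unnecessary.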

Answered By: mozway
