Pandas: Select rows from dataframe having equal value_counts of a specific column
Question:
I have the following dataframe:
df
Place Target
1 A 0
2 B 0
3 C 1
4 B 0
5 F 1
6 Z 0
df['Target'].value_counts()
0 4
1 2
What I want is a new df in which every value of the column has the same value_count, equal to that of the minority value (here, it's 2).
One desired df would look like:
df2
Place Target
1 A 0
2 B 0
3 C 1
5 F 1
df2['Target'].value_counts()
0 2
1 2
Note that the selection (or suppression) process can be done randomly.
Thank you for your help!
Answers:
Here's a solution using groupby and head:
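import pandas as pd

df = pd.DataFrame({'Place': ['A', 'B', 'C', 'B', 'F', 'Z'], 'Target': [0, 0, 1, 0, 1, 0]})
# Count the rows per Target value and take the smallest count
v_counts = df.Target.value_counts()
minimum = v_counts.min()
# Keep the first `minimum` rows of each Target group (not a random selection)
df2 = df.groupby('Target').head(minimum)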
Output:
Place Target
0 A 0
1 B 0
2 C 1
4 F 1
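Since the question says the selection can be random, here is a minimal sketch of a random variant. It assumes pandas 1.1 or later, where groupby objects have a sample method; the random_state value is an arbitrary choice to make the draw repeatable and is not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'Place': ['A', 'B', 'C', 'B', 'F', 'Z'],
                   'Target': [0, 0, 1, 0, 1, 0]})

# Size of the smallest Target class
minimum = df['Target'].value_counts().min()

# Draw `minimum` rows at random from each Target group.
# random_state is an arbitrary seed (assumption) so the result is reproducible.
df2 = df.groupby('Target').sample(n=minimum, random_state=0)

print(df2['Target'].value_counts())
```

Unlike head, which always keeps the first rows of each group, sample picks different rows of the majority class depending on the seed. Drop random_state to get a fresh draw on every run.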