Drop rows that contain the same value in a pandas DataFrame
Question:
I’m currently working on a data frame like the one below:
artist | week1 | week2 | week3 | week4 |
---|---|---|---|---|
Drake | 2 | 2 | 3 | 1 |
Muse | NA | NA | NA | NA |
Bruno Mars | 3 | 3 | 4 | 2 |
Imagine Dragons | NA | NA | NA | NA |
Justin Timberlake | 2 | 2 | NA | 1 |
What I want to do is to drop the rows that only contain "NA" values. The result should be something like this:
artist | week1 | week2 | week3 | week4 |
---|---|---|---|---|
Drake | 2 | 2 | 3 | 1 |
Bruno Mars | 3 | 3 | 4 | 2 |
Justin Timberlake | 2 | 2 | NA | 1 |
I’ve tried using the pandas dropna() function, but by default it drops every row with at least one "NA" value. In that case, the row for Justin Timberlake would be dropped, which is not what I need.
Answers:
Use df.dropna() with how='all', which drops a row only if all of its values are NA. Because the artist column is never NA, also pass subset so that only the week columns are checked.
df = df.dropna(how='all', subset=['week1', 'week2', 'week3', 'week4'])
print(df)
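For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the frame from the question. It assumes the missing entries are real NaN values rather than the literal string "NA"; the names week_cols, unchanged and result are only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question; np.nan stands in for "NA".
df = pd.DataFrame(
    {
        "artist": ["Drake", "Muse", "Bruno Mars", "Imagine Dragons", "Justin Timberlake"],
        "week1": [2, np.nan, 3, np.nan, 2],
        "week2": [2, np.nan, 3, np.nan, 2],
        "week3": [3, np.nan, 4, np.nan, np.nan],
        "week4": [1, np.nan, 2, np.nan, 1],
    }
)

week_cols = ["week1", "week2", "week3", "week4"]

# Without subset nothing is dropped: "artist" is never NA,
# so no row consists entirely of NA values.
unchanged = df.dropna(how="all")

# With subset, only the week columns are inspected.
result = df.dropna(how="all", subset=week_cols)
print(result)
```

The subset argument is what makes how='all' useful here: the row for Justin Timberlake survives because only one of its week values is missing.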
Or keep only the rows with at least 2 non-NA values. The artist column always counts as one, so a row is kept if it has at least one non-NA week value.
df = df.dropna(thresh=2)
print(df)
              artist  week1  week2  week3  week4
0              Drake    2.0    2.0    3.0    1.0
2         Bruno Mars    3.0    3.0    4.0    2.0
4  Justin Timberlake    2.0    2.0    NaN    1.0
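Note that thresh=2 only works here because artist contributes exactly one non-NA value per row. A rough sketch of two variants that do not depend on that, again assuming real NaN values and using illustrative names (week_cols, by_thresh, by_subset_thresh, by_mask):

```python
import numpy as np
import pandas as pd

# Same example frame as in the question; np.nan stands in for "NA".
df = pd.DataFrame(
    {
        "artist": ["Drake", "Muse", "Bruno Mars", "Imagine Dragons", "Justin Timberlake"],
        "week1": [2, np.nan, 3, np.nan, 2],
        "week2": [2, np.nan, 3, np.nan, 2],
        "week3": [3, np.nan, 4, np.nan, np.nan],
        "week4": [1, np.nan, 2, np.nan, 1],
    }
)

week_cols = ["week1", "week2", "week3", "week4"]

# thresh counts non-NA values across all columns, including "artist".
by_thresh = df.dropna(thresh=2)

# Restricting the count to the week columns: keep rows that have
# at least one non-NA week value.
by_subset_thresh = df.dropna(thresh=1, subset=week_cols)

# The same filter written as an explicit boolean mask.
by_mask = df[df[week_cols].notna().any(axis=1)]

print(by_mask)
```

All three select the same rows for this data. The subset and mask versions keep working if more always-filled columns are added later, whereas a bare thresh=2 would then need adjusting.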