How to drop a row in Pandas if a third letter in a column is W?
Question:
I have a dataframe of this kind:
ID | Jan | Feb | Mar |
---|---|---|---|
20WAST | 2 | 2 | 5 |
20S22 | 0 | 0 | 1 |
20W1ST | 2 | 2 | 5 |
200122 | 0 | 0 | 1 |
And I want to drop all the rows where the third letter in the first column is a ‘W’ to give an output:
ID | Jan | Feb | Mar |
---|---|---|---|
20S22 | 0 | 0 | 1 |
200122 | 0 | 0 | 1 |
It is a very large dataframe and I tried doing something like this:
df[df.ID[2] != 'W']
But this only looks at a single item of the column (the one at index 2), not the third character of each ID. I could potentially iterate over the dataframe but wanted to see if there was a better option.
Answers:
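You are almost there. Use the .str accessor so the indexing applies to each string rather than to the column itself:
df = df[df['ID'].str[2].ne('W')]
You might want to reset the index after this selection, for example with df.reset_index(drop=True).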
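As a minimal, self-contained sketch of this answer, the snippet below rebuilds the sample table from the question and applies the filter. The names `single_value`, `mask` and `out` are illustrative and not from the original post. It also shows why the attempt in the question does not work: `df.ID[2]` looks up one value by index label, whereas `.str[2]` takes the third character of every ID.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": ["20WAST", "20S22", "20W1ST", "200122"],
    "Jan": [2, 0, 2, 0],
    "Feb": [2, 0, 2, 0],
    "Mar": [5, 1, 5, 1],
})

# df.ID[2] is a single value (the ID stored at index label 2),
# not a per-row test on the third character.
single_value = df.ID[2]

# .str[2] extracts the third character of each ID;
# .ne('W') is True for the rows we want to keep.
mask = df["ID"].str[2].ne("W")

# Keep the matching rows and renumber them from 0, discarding the old index.
out = df[mask].reset_index(drop=True)
print(out)
```

Because the mask is computed for the whole column at once, no explicit loop over the rows is needed, which matters for a large dataframe.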
You can also use a regex to test the third character:
out = df[df['ID'].str.contains('^.{2}(?!W)')]
# or
out = df[df['ID'].str.match('.{2}(?!W)')]
# or
out = df[df['ID'].str.match('.{2}[^W]')]
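NOTE: the difference between str.contains and str.match is that str.match only matches from the beginning of the string, while str.contains searches anywhere in it. That is why the str.contains pattern needs the explicit ^ anchor.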
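To illustrate the note, here is a small sketch on the IDs from the question (the variable names are illustrative). It compares `str.contains` with and without the `^` anchor against `str.match` using the same lookahead pattern.

```python
import pandas as pd

ids = pd.Series(["20WAST", "20S22", "20W1ST", "200122"])

# Without '^', str.contains can find the pattern anywhere in the string,
# e.g. "0W" followed by "A" in "20WAST", so every row matches.
unanchored = ids.str.contains(".{2}(?!W)")

# With '^', the two characters must be the first two of the string.
anchored = ids.str.contains("^.{2}(?!W)")

# str.match always starts at the beginning, so no '^' is needed.
matched = ids.str.match(".{2}(?!W)")

print(pd.DataFrame({"unanchored": unanchored,
                    "anchored": anchored,
                    "matched": matched}))
```

Only the anchored `str.contains` call and the `str.match` call give a mask that drops the rows whose third character is W.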
$ print(out)
ID Jan Feb Mar
1 20S22 0 0 1
3 200122 0 0 1
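One point that may matter on real data: the three patterns agree on the sample above, but they can differ for IDs shorter than three characters. The sketch below uses made-up IDs (not from the question) to show this. The negative lookahead `(?!W)` succeeds when there is no third character at all, whereas `[^W]` requires a third character to be present.

```python
import pandas as pd

# Hypothetical IDs for illustration, including one with only two characters.
ids = pd.Series(["20", "20W", "20S"])

# Lookahead: "not followed by W" is also satisfied at the end of the string.
keep_lookahead = ids.str.match(".{2}(?!W)")

# Character class: a third character must exist and must not be W.
keep_charclass = ids.str.match(".{2}[^W]")

print(pd.DataFrame({"ID": ids,
                    "lookahead": keep_lookahead,
                    "charclass": keep_charclass}))
```

Which behaviour is preferable depends on whether short IDs should be kept or dropped, which the question does not specify.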