How to drop a row in Pandas if the third letter in a column is W?

Question:

I have a dataframe of this kind:

ID Jan Feb Mar
20WAST 2 2 5
20S22 0 0 1
20W1ST 2 2 5
200122 0 0 1

I want to drop all the rows where the third letter in the first column is 'W', giving this output:

ID Jan Feb Mar
20S22 0 0 1
200122 0 0 1

It is a very large dataframe and I tried doing something like this:

df[df.ID[2] != 'W']

But this only selects the item in the second row. I could potentially iterate over the dataframe but wanted to see if there was a better option.

Asked By: Ray234


Answers:

You are almost there. Use:

df = df[df['ID'].str[2].ne('W')]

You might want to reset the index after this selection, since the remaining rows keep their original index labels.
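
For reference, here is a minimal sketch of the whole thing, assuming the sample data from the question. The names `mask` and `out` are only illustrative, and `reset_index(drop=True)` is one way to renumber the rows:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ID": ["20WAST", "20S22", "20W1ST", "200122"],
    "Jan": [2, 0, 2, 0],
    "Feb": [2, 0, 2, 0],
    "Mar": [5, 1, 5, 1],
})

# .str[2] takes the third character of every ID;
# .ne("W") is the element-wise version of != "W"
mask = df["ID"].str[2].ne("W")

# drop=True discards the old index labels instead of keeping them as a column
out = df[mask].reset_index(drop=True)
print(out)
```

The original attempt, df.ID[2], looks up a single element of the column by index label, whereas df['ID'].str[2] indexes into each string, which is why the second form produces one boolean per row.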

Answered By: user19077881

You can use a regex to test the third character:

out = df[df['ID'].str.contains('^.{2}(?!W)')]
# or
out = df[df['ID'].str.match('.{2}(?!W)')]
# or
out = df[df['ID'].str.match('.{2}[^W]')]

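NOTE: the difference between str.contains and str.match is that str.match only matches at the beginning of the string, while str.contains searches anywhere in it. That is why the str.contains pattern needs the leading ^.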

$ print(out)

       ID  Jan  Feb  Mar
1   20S22    0    0    1
3  200122    0    0    1
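
To illustrate the note on str.contains versus str.match, here is a small sketch that assumes the IDs from the question. The variable names (`ids`, `unanchored`, `anchored`, `matched`) are made up for the example:

```python
import pandas as pd

# IDs copied from the question
ids = pd.Series(["20WAST", "20S22", "20W1ST", "200122"])

# Without ^, str.contains may start matching at any position, so "0W"
# followed by "A" satisfies the pattern and every ID comes back True
unanchored = ids.str.contains(r".{2}(?!W)")

# With ^, the two characters must be the first two, so the lookahead
# really inspects the third character
anchored = ids.str.contains(r"^.{2}(?!W)")

# str.match always starts at the beginning, so no ^ is needed
matched = ids.str.match(r".{2}(?!W)")

print(unanchored.tolist())  # [True, True, True, True]
print(anchored.tolist())    # [False, True, False, True]
print(matched.tolist())     # [False, True, False, True]
```

So dropping the ^ from the str.contains version would silently keep every row.
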
Answered By: Ynjxsjmh