Filter on a pandas string column as numeric without creating a new column

Question:

This should be an easy task, but I am stuck. I have a dataframe with a column of type string:

Category
AB00
CD01
EF02
GH03
RF04

Now I want to treat the trailing digits of these values as numbers, filter on them, and create a subset dataframe. However, I do not want to change the original dataframe in any way. I tried:

df_subset=df[df['Category'].str[2:4]<=3]

Of course this does not work, as the left-hand side is a string and cannot be compared numerically to 3.

I also tried:

df_subset=df[int(df['Category'].str[2:4])<=3]

but I am not sure about this; I think it is wrong, or at least not the way it should be done.

Asked By: PSt


Answers:

Add type conversion to your expression:

df[df['Category'].str[2:].astype(int) <= 3]

  Category
0     AB00
1     CD01
2     EF02
3     GH03
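
For reference, here is a minimal self-contained sketch of this approach. The sample dataframe is reconstructed from the question, and the names `mask`, `safe_mask` and `df_subset_safe` are only illustrative. The `pd.to_numeric` variant is an extra, assuming some suffixes might not be digits; the question does not say that this can happen.

```python
import pandas as pd

# Sample data reconstructed from the question
df = pd.DataFrame({"Category": ["AB00", "CD01", "EF02", "GH03", "RF04"]})

# Build a boolean mask from the numeric suffix; no column is added to df
mask = df["Category"].str[2:].astype(int) <= 3
df_subset = df[mask]

# Assumption: if a suffix might not be numeric, pd.to_numeric with
# errors="coerce" turns it into NaN, and NaN <= 3 is False, so that row drops out
safe_mask = pd.to_numeric(df["Category"].str[2:], errors="coerce") <= 3
df_subset_safe = df[safe_mask]

print(df_subset)
print(list(df.columns))  # still only the original column
```

The conversion happens on a temporary Series used only for the mask, so `df` itself keeps its single string column.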
Answered By: RomanPerekhrest

As the numeric part is zero-padded to a fixed width, you can use string comparison directly:

df_subset = df.loc[df['Category'].str[2:4] <= '03']

Output:

  Category
0     AB00
1     CD01
2     EF02
3     GH03
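
String comparison is lexicographic, so it only matches numeric order when every value has the same width. A small sketch of that caveat follows; the `unpadded` Series is hypothetical data, not taken from the question.

```python
import pandas as pd

df = pd.DataFrame({"Category": ["AB00", "CD01", "EF02", "GH03", "RF04"]})

# Fixed-width, zero-padded suffixes: text order matches numeric order
padded = df[df["Category"].str[2:4] <= "03"]

# Hypothetical unpadded values, to show where text comparison breaks down
unpadded = pd.Series(["2", "10", "3"])
as_text = (unpadded <= "3").tolist()              # "10" sorts before "3" as text
as_number = (unpadded.astype(int) <= 3).tolist()  # numeric comparison

print(as_text)
print(as_number)
```

If the width of the numeric part can vary, converting with `astype(int)` is probably the safer choice.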
Answered By: mozway