Filtering a dataframe by particular data
Question:
I have the code below, which works when I want to filter my original dataframe on an entire project number value. I would like to update it so that it looks only at the last digit of the Project Number field and filters on that. So df_0 would be a new dataframe containing only the rows of the original where the project number ends in 0. I’m struggling with the syntax for referencing only the last digit rather than the entire number.
import pandas as pd
df = pd.read_excel(r'C:UsersXXXXXX.xlsx')
projects_df = df["Project Number"].drop_duplicates()
df_0 = df[df["Project Number"] == XXXXXXX]
So something like this, though this doesn’t work:
df_0 = df[df["Project Number"][-1] == 0]
Answers:
You could use the astype() method to convert the column to strings, then compare the last character of each value:
df_0 = df[df["Project Number"].astype(str).str[-1] == "0"]
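As a minimal sketch of the string approach, the example below uses a small made-up dataframe in place of the Excel file (the values and the "Hours" column are assumptions for illustration only). It also shows str.endswith() as an equivalent spelling. Note that this assumes the column holds integers or strings: if pandas read the column as floats, astype(str) would produce values like "1000.0", and every row would appear to end in "0".

```python
import pandas as pd

# Hypothetical sample data standing in for the Excel file
df = pd.DataFrame({
    "Project Number": [1000, 1001, 2340, 5557],
    "Hours": [5, 3, 8, 2],
})

# Convert each project number to a string and take its last character
last_digit = df["Project Number"].astype(str).str[-1]

# Keep only the rows whose project number ends in "0"
df_0 = df[last_digit == "0"]

# Equivalent filter using str.endswith()
df_0_alt = df[df["Project Number"].astype(str).str.endswith("0")]

print(df_0)
```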
Use modulo division by 10 to recognize numbers that end with 0:
df_0 = df[(df["Project Number"] % 10) == 0]
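The modulo approach could be sketched as follows, again on made-up sample values and assuming the column is stored as integers. Because the remainder after dividing by 10 is the last digit, the same Series can also be passed to groupby() to split the dataframe by every ending digit at once; the by_digit dictionary is a hypothetical name introduced here, not something from the question.

```python
import pandas as pd

# Hypothetical sample data; assumes an integer Project Number column
df = pd.DataFrame({"Project Number": [1000, 1001, 2340, 5557, 1231]})

# The remainder after dividing by 10 is the last digit of each number
last_digit = df["Project Number"] % 10

# Rows whose project number ends in 0
df_0 = df[last_digit == 0]

# Optionally, build one dataframe per ending digit in a single pass
by_digit = {digit: group for digit, group in df.groupby(last_digit)}

print(df_0)
print(by_digit[1])
```

Only digits that actually occur in the data appear as keys in by_digit, so looking up a missing digit raises a KeyError.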