Check the presence of a number after nth digit and then filter your dataframe in Python
Question:
My data consists of a date column, which looks like –
Date
20221201
20221202
20221203
20221101
20221102
I need to filter the dataframe only for the month of december, I tried something like –
df = df[df['Date'][4:6] == 12]
its not working
Answers:
I would convert strings into actual dates, then extract month and check if it is equal to desired following way
import pandas as pd
df = pd.DataFrame({"Date":["20221201","20221202","20221203","20221101","20221102"]})
df["Date_object"] = pd.to_datetime(df["Date"],format="%Y%m%d")
df2 = df[df["Date_object"].dt.month==12]
print(df2)
gives output
Date Date_object
0 20221201 2022-12-01
1 20221202 2022-12-02
2 20221203 2022-12-03
See @Daweo’s answer for the clean way to do this.
If you actually want to ‘slice’ an integer (not recommended with dates), you can use a floor division (//
) in combination with the modulus (%
).
df = df[(df['Date'] // 100) % 100 == 12]
Explanation:
// 100
‘removes’ the last two digits while % 100
only keeps the last two digits.
20221201 // 100 = 202212
202212 % 100 = 12
My data consists of a date column, which looks like –
Date |
---|
20221201 |
20221202 |
20221203 |
20221101 |
20221102 |
I need to filter the dataframe only for the month of december, I tried something like –
df = df[df['Date'][4:6] == 12]
its not working
I would convert strings into actual dates, then extract month and check if it is equal to desired following way
import pandas as pd
df = pd.DataFrame({"Date":["20221201","20221202","20221203","20221101","20221102"]})
df["Date_object"] = pd.to_datetime(df["Date"],format="%Y%m%d")
df2 = df[df["Date_object"].dt.month==12]
print(df2)
gives output
Date Date_object
0 20221201 2022-12-01
1 20221202 2022-12-02
2 20221203 2022-12-03
See @Daweo’s answer for the clean way to do this.
If you actually want to ‘slice’ an integer (not recommended with dates), you can use a floor division (//
) in combination with the modulus (%
).
df = df[(df['Date'] // 100) % 100 == 12]
Explanation:
// 100
‘removes’ the last two digits while % 100
only keeps the last two digits.
20221201 // 100 = 202212
202212 % 100 = 12