How to modify a date value to keep only the year and month
Question:
I have a data frame, which you can reproduce by running:
import pandas as pd
from io import StringIO
df = """
case_id scheduled_date code
1213 2021-08-17 1
3444 2021-06-24 3
4566 2021-07-20 5
"""
df = pd.read_csv(StringIO(df.strip()), sep=r'\s+', engine='python')
How can I change scheduled_date to keep only the year and month? The output should be:
case_id scheduled_date code
0 1213 2021-08 1
1 3444 2021-06 3
2 4566 2021-07 5
Answers:
Convert the column to datetime and then take the monthly period:
df['month'] = pd.to_datetime(df['scheduled_date']).dt.to_period('M')
case_id scheduled_date code month
0 1213 2021-08-17 1 2021-08
1 3444 2021-06-24 3 2021-06
2 4566 2021-07-20 5 2021-07
Note that with this method the dtype will be period[M], not object.
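To make the dtype point concrete, here is a minimal sketch. It rebuilds the sample frame directly instead of parsing the text block, and the names month and month_str are only illustrative. It assumes plain strings are wanted in the end, which is why it finishes with astype(str).

```python
import pandas as pd

# Sample data from the question, built directly for brevity.
df = pd.DataFrame({
    "case_id": [1213, 3444, 4566],
    "scheduled_date": ["2021-08-17", "2021-06-24", "2021-07-20"],
    "code": [1, 3, 5],
})

# Parse the strings, then truncate each date to its monthly period.
month = pd.to_datetime(df["scheduled_date"]).dt.to_period("M")
print(month.dtype)

# Period values still support date-like accessors such as .dt.month.
print(month.dt.month.tolist())

# If plain strings are needed instead, cast the periods to str.
month_str = month.astype(str)
print(month_str.tolist())
```

Keeping the period dtype is handy for grouping or sorting by month; casting to strings is only needed when the column has to be text.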
You can use string parsing to drop the day of the month (I’m assuming you want strings since the days in the expected output are absent):
df["scheduled_date"] = df["scheduled_date"].str.split("-").str[:2].str.join("-")
This outputs:
case_id scheduled_date code
0 1213 2021-08 1
1 3444 2021-06 3
2 4566 2021-07 5
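As a variation on the same string-based idea, and only under the assumption that every value is a fixed-width YYYY-MM-DD string, slicing the first seven characters gives the same result. This is a sketch, not a replacement for real date parsing: it does no validation, so malformed values pass through silently.

```python
import pandas as pd

# Assumes every entry is a zero-padded ISO date string (YYYY-MM-DD).
dates = pd.Series(["2021-08-17", "2021-06-24", "2021-07-20"])

# Keep the first 7 characters: four-digit year, dash, two-digit month.
year_month = dates.str[:7]
print(year_month.tolist())
```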
You can also try this:
df['scheduled_date'] = pd.to_datetime(df.scheduled_date, format='%Y-%m-%d').dt.strftime('%Y-%m')
case_id scheduled_date code
0 1213 2021-08 1
1 3444 2021-06 3
2 4566 2021-07 5
First, convert your string column to a datetime column. After that you can apply many different date operations.
For reference: I learnt the answer from this thread – Drop the year from "Year-month-date" format in a pandas dataframe
# converting to datetime:
df['scheduled_date'] = pd.to_datetime(df['scheduled_date'])
# converting the datetime column to desired output
df['scheduled_date'] = df['scheduled_date'].dt.strftime('%Y-%m')
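The format string matters here: %Y is the four-digit year and %y is the two-digit year. A small sketch of the difference, using a standalone Series with the question's dates (the variable names are just for illustration):

```python
import pandas as pd

# Parse the sample dates from the question into datetime values.
dates = pd.to_datetime(pd.Series(["2021-08-17", "2021-06-24", "2021-07-20"]))

# %Y keeps the century, matching the expected output in the question.
four_digit = dates.dt.strftime("%Y-%m")
print(four_digit.tolist())

# %y drops the century, which is usually not what is wanted here.
two_digit = dates.dt.strftime("%y-%m")
print(two_digit.tolist())
```

Either way, strftime returns strings, so the column has to be parsed again before any further datetime operations.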