How to modify date value to only keep month value

Question:

I have a data frame, which you can reproduce by running:

import pandas as pd
from io import StringIO
    
df = """  
           case_id    scheduled_date        code
           1213       2021-08-17            1
           3444       2021-06-24            3
           4566       2021-07-20            5
          
    """
df = pd.read_csv(StringIO(df.strip()), sep=r'\s\s+', engine='python')

How can I change scheduled_date to only keep year and month? The output should be:

  case_id   scheduled_date  code
0   1213    2021-08         1
1   3444    2021-06         3
2   4566    2021-07         5
Asked By: William

||

Answers:

Convert the column to datetime, then to a monthly period, which keeps only the year and month:

df['month'] = pd.to_datetime(df['scheduled_date']).dt.to_period('M')

   case_id scheduled_date  code    month
0     1213     2021-08-17     1  2021-08
1     3444     2021-06-24     3  2021-06
2     4566     2021-07-20     5  2021-07

Note that with this method the dtype will be period[M], not object.
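
To make the dtype difference concrete, here is a minimal sketch that rebuilds the sample frame by hand (the variable names as_period and as_text are just for illustration) and compares the period result with a plain string result:

```python
import pandas as pd

# Rebuild the sample data from the question
df = pd.DataFrame({
    "case_id": [1213, 3444, 4566],
    "scheduled_date": ["2021-08-17", "2021-06-24", "2021-07-20"],
    "code": [1, 3, 5],
})

dates = pd.to_datetime(df["scheduled_date"])

as_period = dates.dt.to_period("M")   # monthly periods, dtype period[M]
as_text = dates.dt.strftime("%Y-%m")  # plain strings such as "2021-08"

print(as_period.dtype)               # the period dtype
print(as_period.dt.month.tolist())   # periods still expose date parts
print(as_text.tolist())              # strings only support text operations
```

If you need to sort, group or extract date parts later, the period column is usually the more convenient of the two; if you only need the text for display or export, the string version is enough.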

Answered By: It_is_Chris

You can use string parsing to drop the day of the month (I’m assuming you want strings since the days in the expected output are absent):

df["scheduled_date"] = df["scheduled_date"].str.split("-").str[:2].str.join("-")

This outputs:

   case_id scheduled_date  code
0     1213        2021-08     1
1     3444        2021-06     3
2     4566        2021-07     5
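
A shorter variant of the same string approach is to slice the first seven characters. This is only a sketch and assumes every value is a string in exactly YYYY-MM-DD form, as in the sample data; it does no validation:

```python
import pandas as pd

# Sample data from the question, kept as plain strings
df = pd.DataFrame({
    "case_id": [1213, 3444, 4566],
    "scheduled_date": ["2021-08-17", "2021-06-24", "2021-07-20"],
    "code": [1, 3, 5],
})

# "YYYY-MM" is always the first 7 characters of "YYYY-MM-DD"
df["scheduled_date"] = df["scheduled_date"].str[:7]

print(df)
```
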
Answered By: BrokenBenchmark

You can also try this:

df['scheduled_date'] = pd.to_datetime(df.scheduled_date, format='%Y-%m-%d').dt.strftime('%Y-%m')


   case_id scheduled_date  code
0     1213        2021-08     1
1     3444        2021-06     3
2     4566        2021-07     5
Answered By: Anoushiravan R

First, convert the string column to a datetime column. Once it is a datetime column, you can apply many different date operations to it.

For reference: I learnt the answer from this thread – Drop the year from "Year-month-date" format in a pandas dataframe

# converting to datetime: 
df['scheduled_date'] = pd.to_datetime(df['scheduled_date'])

# converting the datetime column to desired output
df['scheduled_date'] = df['scheduled_date'].dt.strftime('%Y-%m')
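
The case of the year directive in the strftime format matters: %Y gives the four-digit year and %y gives the two-digit year. A small sketch on the sample dates (the variable names are illustrative only):

```python
import pandas as pd

dates = pd.to_datetime(pd.Series(["2021-08-17", "2021-06-24", "2021-07-20"]))

four_digit = dates.dt.strftime("%Y-%m")  # four-digit year, e.g. 2021-08
two_digit = dates.dt.strftime("%y-%m")   # two-digit year, e.g. 21-08

print(four_digit.tolist())
print(two_digit.tolist())
```

Any extra characters in the format string, including trailing spaces, are copied into every output value, so keep the format string exact.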

Answered By: Yogendra Yatnalkar