How can I convert these dates to the correct format in a Pandas Dataframe?

Question:

I have a dataframe with a column of date strings that I want to convert to datetime. I used pd.to_datetime for this, but it only works for some of the dates, because the others have their parts written in a different order. Example:

import pandas as pd

df = pd.DataFrame({'dates' : ['December 2021 17', '2005 July 01', 'December 2000 01', '2008 May 11',
                              'October 2000 04', 'September 2016 04', 'May 1998 09']})

pd.to_datetime only returns values for the strings written in year-month-day order. I tried splitting each string into a list and reordering the parts, but I could not get that to work.

Asked By: Adam Idris

||

Answers:

You can pass pd.to_datetime to apply, so that each string is parsed on its own:

df.dates = df.dates.apply(pd.to_datetime)

This is the output of df now:

       dates
0 2021-12-17
1 2005-07-01
2 2000-12-01
3 2008-05-11
4 2000-10-04
5 2016-09-04
6 1998-05-09
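
A self-contained sketch of the same approach, using the sample data from the question. It assumes every string can be parsed by itself; the name parsed is only illustrative. Because apply hands pd.to_datetime one value at a time, the order of the parts can differ from row to row.

```python
import pandas as pd

df = pd.DataFrame({'dates': ['December 2021 17', '2005 July 01', 'December 2000 01',
                             '2008 May 11', 'October 2000 04', 'September 2016 04',
                             'May 1998 09']})

# apply calls pd.to_datetime once per element, so each string is
# interpreted independently of the others.
parsed = df['dates'].apply(pd.to_datetime)

# Check that the result is a datetime column with no missing values.
print(parsed.dtype)
print(parsed.isna().sum())
```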
Answered By: Marcelo Paco

One option is to extract the year, month and day separately, then join them in a fixed order:

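# Pull out each component by its shape, wherever it sits in the string.
y = df['dates'].str.extract(r'(?P<year>\b\d{4}\b)', expand=False)       # four-digit year
d = df['dates'].str.extract(r'(?P<day>\b\d{2}\b)', expand=False)        # two-digit day
m = df['dates'].str.extract(r'(?P<month>\b[A-Za-z]+\b)', expand=False)  # month name

# Join as e.g. '2021December17' and parse with an explicit format.
pd.to_datetime(y.str.cat([m, d]), format='%Y%B%d')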

Output:

0   2021-12-17
1   2005-07-01
2   2000-12-01
3   2008-05-11
4   2000-10-04
5   2016-09-04
6   1998-05-09
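
If the column is known to use only a few layouts, another option is to skip the extraction and try each layout with an explicit format, keeping whichever one parses. A minimal sketch, assuming the two layouts seen in the question (month-year-day and year-month-day) are the only ones present; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'dates': ['December 2021 17', '2005 July 01', 'December 2000 01',
                             '2008 May 11', 'October 2000 04', 'September 2016 04',
                             'May 1998 09']})

# errors='coerce' yields NaT wherever a string does not match the format.
month_first = pd.to_datetime(df['dates'], format='%B %Y %d', errors='coerce')
year_first = pd.to_datetime(df['dates'], format='%Y %B %d', errors='coerce')

# Fill the gaps left by the first layout with results from the second.
parsed = month_first.fillna(year_first)
print(parsed)
```

Any value that matches neither layout stays NaT, which makes unexpected strings easy to spot with parsed.isna().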
Answered By: rhug123

If you would rather not use apply, as suggested by @Marcelo Paco, you can call pd.to_datetime on the whole column instead.

Suppose your dataframe is called date_df. You can convert the dates column as follows:

import pandas as pd

date_df['dates'] = pd.to_datetime(date_df['dates'])
date_df

Output:

    dates
0   2021-12-17
1   2005-07-01
2   2000-12-01
3   2008-05-11
4   2000-10-04
5   2016-09-04
6   1998-05-09
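
Note that this result depends on the pandas version. As far as I know, from pandas 2.0 onward to_datetime infers one format from the first value and applies it to the whole column, so a column with mixed layouts may raise a ValueError. In those versions, format='mixed' asks pandas to parse each value individually. A hedged sketch, using the question's data under the name date_df and falling back to apply on older versions that do not recognise 'mixed':

```python
import pandas as pd

date_df = pd.DataFrame({'dates': ['December 2021 17', '2005 July 01', 'December 2000 01',
                                  '2008 May 11', 'October 2000 04', 'September 2016 04',
                                  'May 1998 09']})

try:
    # pandas 2.0 and later: infer the format separately for every value.
    parsed = pd.to_datetime(date_df['dates'], format='mixed')
except ValueError:
    # Assumption: older versions treat 'mixed' as a literal strptime
    # format and fail, so parse element by element instead.
    parsed = date_df['dates'].apply(pd.to_datetime)

date_df['dates'] = parsed
print(date_df)
```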
Answered By: Ugyen Norbu