Give default datetime object value to pandas.to_datetime()
Question:
I have some dates as strings in different formats that I convert to datetime objects using to_datetime(). However, the list of strings also contains some garbage values that I want to convert to a default date.
import pandas as pd
import datetime as dt
print(df)
dates
0 2018-02-12
1 2018-03-19
2 12-24-2018
3 garbage
I use errors='coerce' to avoid raising an exception. It produces NaT, which I want to convert to a default date, 2018-12-31 in my case.
df['dates'] = pd.to_datetime(df['dates'], errors='coerce')
This gives the result below.
dates
0 2018-02-12
1 2018-03-19
2 2018-12-24
3 NaT
Approach:
I check whether each value is a valid datetime and, if not, substitute the default datetime object. But for some reason, every value is replaced by the default.
df['dates'].apply(lambda x: dt.datetime(2018,12,31) if x is not dt.datetime else x)
Current Output:
dates
0 2018-12-31
1 2018-12-31
2 2018-12-31
3 2018-12-31
Expected Output:
dates
0 2018-02-12
1 2018-03-19
2 2018-12-24
3 2018-12-31
Is there a way to give a default date to the to_datetime() function so that it won’t produce NaT? If not, how do I fill in default dates afterwards?
Answers:
You just need to add fillna after the pd.to_datetime call:
pd.to_datetime(df['dates'], errors='coerce').fillna(pd.to_datetime('2018-12-31'))
Out[217]:
0 2018-02-12
1 2018-03-19
2 2018-12-24
3 2018-12-31
Name: dates, dtype: datetime64[ns]
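For completeness, here is a self-contained sketch of the same idea. The DataFrame is rebuilt from the values shown in the question, and DEFAULT_DATE, parsed, filled and fixed_apply are names chosen only for this illustration. One assumption worth stating: pandas 2.0 and later infer a single format from the first value, so a mixed-format column parsed in one call may turn 12-24-2018 into NaT as well. Parsing each string on its own, as below, is one way around that (passing format='mixed' should also work on pandas 2.0+). The sketch also shows why the lambda in the question replaced everything: `x is not dt.datetime` compares each value with the datetime class itself, which is always true, whereas a pd.isna check only matches the NaT entries.

```python
import pandas as pd

# Default date taken from the question; the variable name is illustrative.
DEFAULT_DATE = pd.Timestamp("2018-12-31")

# Rebuild the sample data shown in the question.
df = pd.DataFrame({"dates": ["2018-02-12", "2018-03-19", "12-24-2018", "garbage"]})

# Parse each string separately so every value gets its own format inference.
# Unparseable strings become NaT instead of raising.
parsed = df["dates"].apply(lambda s: pd.to_datetime(s, errors="coerce"))

# Option 1: replace NaT with the default date via fillna.
filled = parsed.fillna(DEFAULT_DATE)

# Option 2: the apply-based approach from the question, with the check fixed.
# pd.isna(x) is True only for missing values such as NaT.
fixed_apply = parsed.apply(lambda x: DEFAULT_DATE if pd.isna(x) else x)

print(filled)
```

fillna is usually the simpler choice; the apply version is shown only to mirror the original attempt.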