Pandas convert column with year integer to datetime
Question:
I am having trouble converting a column (dtype: int64) to datetime with pandas.
Original data:
Year
2015
2014
...
2010
Desired outcome:
Year
2015-01-01
2014-01-01
...
2010-01-01
My current result:
Year
1970-01-01 00:00:00.000002015
1970-01-01 00:00:00.000002014
...
1970-01-01 00:00:00.000002010
I have tried:
data.Year = pd.to_datetime(data.Year)
data.Year = pd.to_datetime(data.Year, format='%Y-%m-%d')
Answers:
Use format='%Y'
In [225]: pd.to_datetime(df.Year, format='%Y')
Out[225]:
0 2015-01-01
1 2014-01-01
2 2010-01-01
Name: Year, dtype: datetime64[ns]
Details
In [226]: df
Out[226]:
Year
0 2015
1 2014
2 2010
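A minimal sketch of why the question's result lands in 1970, assuming a small hand-built DataFrame named df with the sample years from the question (the names as_epoch and as_year are made up for illustration). With no format, pandas reads integers as a count of nanoseconds since the Unix epoch, so 2015 becomes 2015 ns after 1970-01-01. With format='%Y', each value is read as a four-digit year.

```python
import pandas as pd

# Sample data mirroring the question (hand-built for illustration).
df = pd.DataFrame({"Year": [2015, 2014, 2010]})

# No format: integers are treated as nanoseconds since 1970-01-01.
as_epoch = pd.to_datetime(df["Year"])

# format="%Y": each integer is parsed as a four-digit year.
as_year = pd.to_datetime(df["Year"], format="%Y")

print(as_epoch)
print(as_year)
```

The second attempt in the question, format='%Y-%m-%d', does not match the data either, since the values contain only a year.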
I know this is an old question, but there's a catch when converting int to datetime: in my case, when the data type was int64, the parsing came out wrong. I had the same situation when trying to convert a list of years stored as int64. It resulted in:
pd.to_datetime(df.Year, format='%Y')
Year
1970-01-01 00:00:00.000002015
1970-01-01 00:00:00.000002014
...
1970-01-01 00:00:00.000002010
To avoid this, I had to convert the column from int64 to int32 with df.Year = df.Year.astype('int32'). Then you can parse it with pd.to_datetime(df.Year, format='%Y') and you will get the correct output.
2015
2014
...
2010
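The cast-then-parse steps from this answer can be sketched as follows, assuming a hand-built df with the question's sample years (the name parsed is made up for illustration). Note that astype returns a new Series, so the result has to be assigned back to the column for the cast to take effect.

```python
import pandas as pd

# Sample data mirroring the question (hand-built for illustration).
df = pd.DataFrame({"Year": [2015, 2014, 2010]})

# astype does not modify the column in place; assign the result back.
df["Year"] = df["Year"].astype("int32")

# Parse the int32 years with an explicit year-only format.
parsed = pd.to_datetime(df["Year"], format="%Y")

print(df["Year"].dtype)
print(parsed)
```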
I faced a similar issue. In my case pd.to_datetime(df.Year, format='%Y') worked, but not completely. I also had to add .year at the end, and voilà, that worked fine.
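The answer does not say exactly where .year was added. One possible reading, and it is only an assumption, is the .dt.year accessor on the parsed column, which pulls the year back out as integers. A minimal sketch with hand-built sample data (the names parsed and years are made up for illustration):

```python
import pandas as pd

# Sample data mirroring the question (hand-built for illustration).
df = pd.DataFrame({"Year": [2015, 2014, 2010]})

# Parse the integer years into datetimes first.
parsed = pd.to_datetime(df["Year"], format="%Y")

# Assumed reading of ".year": the .dt accessor on a datetime Series.
years = parsed.dt.year

print(years)
```

This gives plain year numbers rather than the 2015-01-01 style dates the question asks for, so it only fits if the year alone is what you need.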