Convert Integer into Datetime in Python

Question:

I'm working on a DataFrame that stores its dates as integers, and I want to convert them to datetime so I can keep manipulating them.

Basically, I have a DataFrame with a column like the one below:

    DATE
18010101
18010101
18010101
18010101
18010101
... ... ... ... ... ...
20123124
20123124
20123124
20123124
20123124

The digits in this dataset are ordered as (year, month, day, hour).

I've already tried something like this:

df["Year"] = df.DATE[0:2]
df["Month"] = df.DATE[2:4]

but it converts the values to float, and for the first row the result looks like this:

DATE    Year
18010101    18010101.0

when it was supposed to be

Year = 2018 or 18

I would really appreciate your help.

Asked By: Carlos Juvenal


Answers:

Use Series.str.extract and a regex

import pandas as pd

df = pd.DataFrame([[18010101], [18010101], [18010101], [18010101], [18010101],
                   [20123124], [20123124], [20123124], [20123124], [20123124]],
                  columns=['DATE'])
# Each (\d{2}) group captures two digits: year, month, day, hour
df[['year', 'month', 'day', 'hour']] = df['DATE'].astype(str).str.extract(r'(\d{2})(\d{2})(\d{2})(\d{2})')

DATE year month day hour
18010101 18 01 01 01
18010101 18 01 01 01
18010101 18 01 01 01
18010101 18 01 01 01
18010101 18 01 01 01
20123124 20 12 31 24
20123124 20 12 31 24
20123124 20 12 31 24
20123124 20 12 31 24
20123124 20 12 31 24
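
The extracted columns above are strings. If you want numbers instead, a minimal sketch along the same lines could use named groups and cast the result. The group names and the `+ 2000` step are my own assumptions (the latter only holds if every two-digit year falls in the 2000s), so adjust them to your data.

```python
import pandas as pd

df = pd.DataFrame({"DATE": [18010101, 20123124]})

# Named groups become the column names of the extracted frame.
pattern = r"(?P<year>\d{2})(?P<month>\d{2})(?P<day>\d{2})(?P<hour>\d{2})"
parts = df["DATE"].astype(str).str.extract(pattern).astype(int)

# Assumption: all two-digit years belong to the 2000s.
parts["year"] += 2000

result = df.join(parts)
print(result)
```

After this, `result["year"]` holds integers such as 2018 rather than the string `"18"`.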
Answered By: azro

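Earlier, you tried to slice an integer column as if it were a string. df.DATE[0:2] selects the first two rows of the column, not the first two digits of each value, and the rows left without a value become NaN, which is why the new column turns into floats.
Instead, first convert the column to strings (with .astype(str)) and then slice it through the .str accessor. Check the lines below.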

date_str = df.DATE.astype(str)
df["Year"] = date_str.str[0:2]
df["Month"] = date_str.str[2:4]
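
A slightly fuller sketch of the same slicing, in case it helps. The third value (9010101) is a made-up example, not from the question: it shows that an integer cannot keep a leading zero, so I pad to eight characters with `str.zfill` before slicing. Casting to `int` at the end is optional.

```python
import pandas as pd

# The last value is hypothetical: year "09" stored as an integer loses its leading zero.
df = pd.DataFrame({"DATE": [18010101, 20123124, 9010101]})

# Pad back to 8 characters so every slice lines up.
date_str = df["DATE"].astype(str).str.zfill(8)

df["Year"] = date_str.str[0:2].astype(int)
df["Month"] = date_str.str[2:4].astype(int)
df["Day"] = date_str.str[4:6].astype(int)
df["Hour"] = date_str.str[6:8].astype(int)
print(df)
```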
Answered By: Jitesh

Do your hour stamps actually run from 1 to 24? The usual convention for 24-hour time is 0-23, in which case you can use:

df = pd.DataFrame([[18010100], [18010100], [18010100], [18010100], [18010100],
                   [20123123], [20123123], [20123123], [20123123], [20123123]],
                  columns=['DATE'])

pd.to_datetime(df['DATE'], format='%y%m%d%H')

Or simply subtract 1 from each number beforehand, like so:

df = pd.DataFrame([[18010101], [18010101], [18010101], [18010101], [18010101],
                   [20123124], [20123124], [20123124], [20123124], [20123124]],
                  columns=['DATE'])
pd.to_datetime(df['DATE']-1, format='%y%m%d%H')
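
If you would rather not do arithmetic on the encoded integer, here is a hedged alternative sketch: parse only the date digits and add the hour as a timedelta. The column names `start` and `end` are my own, and which of the two interpretations is right depends on what hour 24 means in your data, which the question does not say.

```python
import pandas as pd

df = pd.DataFrame({"DATE": [18010101, 20123124]})

date_str = df["DATE"].astype(str).str.zfill(8)

# Parse only the date part (first six digits, YYMMDD).
day_start = pd.to_datetime(date_str.str[:6], format="%y%m%d")
hour = date_str.str[6:8].astype(int)

# Interpretation 1 (same result as subtracting 1): hour h starts at h - 1.
df["start"] = day_start + pd.to_timedelta(hour - 1, unit="h")

# Interpretation 2 (assumption): hour h marks the end of the interval,
# so hour 24 rolls over to midnight of the next day.
df["end"] = day_start + pd.to_timedelta(hour, unit="h")
print(df)
```

With the second interpretation, 20123124 becomes 2021-01-01 00:00 instead of 2020-12-31 23:00.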
Answered By: Matt Rosinski