Problem combining Date and Time Column using python pandas

Question:

First of all: I’m sorry if that’s a repost, but I did not found a similiar question.

I have a DataFrame containing one column with values, one with timestamps and one with a corresponding date.

Value | Time     | Date
0.00  | 11:08:00 | 30.07.2020
0.01  | 24:00:00 | 30.07.2020
0.02  | 00:01:00 | 31.07.2020

As far as I understood I have to use pd.to_datetime for column ‘Date’ and pd.to_timedelta for column ‘Time’.

My problem is: I want to plot this whole DataFrame. Therefore I have to combine both columns into one, named ‘Date_Time’. That worked so far, but problems occurs on rows where the df[‘Time’] is 24:00:00. And I got the Errormessage, that my Time has to be between 0 and 23.

Both columns contain strings so far. I thought about replacing the 24:00:00 by 00:00:00 and so the corresponding date has to change to the next day.
But I don’t know how to do that.

My desired Output should look like:

Value | Date_Time
0.00  | 2020.07.30 11:08:00
0.01  | 2020.07.31 00:00:00
0.02  | 2020.07.31 00:01:00

Thanks in advance.

Asked By: Grinorye

||

Answers:

If you want a string, use:

df['Date_Time'] = df.pop('Date')+' '+df.pop('Time')

output:

   Value            Date_Time
0   0.00  30.07.2020 11:08:00
1   0.01  30.07.2020 24:00:00
2   0.02  31.07.2020 00:01:00

To correctly handle the 24:00 as dateimt:

# drop date/time and concatenate as single string
s = df.pop('Date')+' '+df.pop('Time')

# identify dates with 24:00 format
m = s.str.contains(' 24:')

# convert to datetime and add 1 day
df['Date_Time'] = (pd.to_datetime(s.str.replace(' 24:', ' 00:'))
                   + pd.DateOffset(days=1)*m.astype(int)
                   )

output:

   Value           Date_Time
0   0.00 2020-07-30 11:08:00
1   0.01 2020-07-31 00:00:00
2   0.02 2020-07-31 00:01:00
Answered By: mozway

Very similar to https://stackoverflow.com/a/17978188/19903280

Essentially, you should be able to do

pd.to_datetime(df['Date'] + df['Time'])

you don’t need the two functions you mentioned in your question, just the one. You can also use the ‘format’ parameter to specify how you want time displayed.

Answered By: Aoife Hughes

IMHO the cleaner solution will be as follows

df['date_time'] = pd.to_datetime(df.Date) + pd.to_timedelta(df.Time)
Answered By: sayan dasgupta
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.