Problem combining Date and Time Column using python pandas
Question:
First of all: I’m sorry if that’s a repost, but I did not found a similiar question.
I have a DataFrame containing one column with values, one with timestamps and one with a corresponding date.
Value | Time | Date
0.00 | 11:08:00 | 30.07.2020
0.01 | 24:00:00 | 30.07.2020
0.02 | 00:01:00 | 31.07.2020
As far as I understood I have to use pd.to_datetime
for column ‘Date’ and pd.to_timedelta
for column ‘Time’.
My problem is: I want to plot this whole DataFrame. Therefore I have to combine both columns into one, named ‘Date_Time’. That worked so far, but problems occurs on rows where the df[‘Time’] is 24:00:00. And I got the Errormessage, that my Time has to be between 0 and 23.
Both columns contain strings so far. I thought about replacing the 24:00:00 by 00:00:00 and so the corresponding date has to change to the next day.
But I don’t know how to do that.
My desired Output should look like:
Value | Date_Time
0.00 | 2020.07.30 11:08:00
0.01 | 2020.07.31 00:00:00
0.02 | 2020.07.31 00:01:00
Thanks in advance.
Answers:
If you want a string, use:
df['Date_Time'] = df.pop('Date')+' '+df.pop('Time')
output:
Value Date_Time
0 0.00 30.07.2020 11:08:00
1 0.01 30.07.2020 24:00:00
2 0.02 31.07.2020 00:01:00
To correctly handle the 24:00 as dateimt:
# drop date/time and concatenate as single string
s = df.pop('Date')+' '+df.pop('Time')
# identify dates with 24:00 format
m = s.str.contains(' 24:')
# convert to datetime and add 1 day
df['Date_Time'] = (pd.to_datetime(s.str.replace(' 24:', ' 00:'))
+ pd.DateOffset(days=1)*m.astype(int)
)
output:
Value Date_Time
0 0.00 2020-07-30 11:08:00
1 0.01 2020-07-31 00:00:00
2 0.02 2020-07-31 00:01:00
Very similar to https://stackoverflow.com/a/17978188/19903280
Essentially, you should be able to do
pd.to_datetime(df['Date'] + df['Time'])
you don’t need the two functions you mentioned in your question, just the one. You can also use the ‘format’ parameter to specify how you want time displayed.
IMHO the cleaner solution will be as follows
df['date_time'] = pd.to_datetime(df.Date) + pd.to_timedelta(df.Time)
First of all: I’m sorry if that’s a repost, but I did not found a similiar question.
I have a DataFrame containing one column with values, one with timestamps and one with a corresponding date.
Value | Time | Date
0.00 | 11:08:00 | 30.07.2020
0.01 | 24:00:00 | 30.07.2020
0.02 | 00:01:00 | 31.07.2020
As far as I understood I have to use pd.to_datetime
for column ‘Date’ and pd.to_timedelta
for column ‘Time’.
My problem is: I want to plot this whole DataFrame. Therefore I have to combine both columns into one, named ‘Date_Time’. That worked so far, but problems occurs on rows where the df[‘Time’] is 24:00:00. And I got the Errormessage, that my Time has to be between 0 and 23.
Both columns contain strings so far. I thought about replacing the 24:00:00 by 00:00:00 and so the corresponding date has to change to the next day.
But I don’t know how to do that.
My desired Output should look like:
Value | Date_Time
0.00 | 2020.07.30 11:08:00
0.01 | 2020.07.31 00:00:00
0.02 | 2020.07.31 00:01:00
Thanks in advance.
If you want a string, use:
df['Date_Time'] = df.pop('Date')+' '+df.pop('Time')
output:
Value Date_Time
0 0.00 30.07.2020 11:08:00
1 0.01 30.07.2020 24:00:00
2 0.02 31.07.2020 00:01:00
To correctly handle the 24:00 as dateimt:
# drop date/time and concatenate as single string
s = df.pop('Date')+' '+df.pop('Time')
# identify dates with 24:00 format
m = s.str.contains(' 24:')
# convert to datetime and add 1 day
df['Date_Time'] = (pd.to_datetime(s.str.replace(' 24:', ' 00:'))
+ pd.DateOffset(days=1)*m.astype(int)
)
output:
Value Date_Time
0 0.00 2020-07-30 11:08:00
1 0.01 2020-07-31 00:00:00
2 0.02 2020-07-31 00:01:00
Very similar to https://stackoverflow.com/a/17978188/19903280
Essentially, you should be able to do
pd.to_datetime(df['Date'] + df['Time'])
you don’t need the two functions you mentioned in your question, just the one. You can also use the ‘format’ parameter to specify how you want time displayed.
IMHO the cleaner solution will be as follows
df['date_time'] = pd.to_datetime(df.Date) + pd.to_timedelta(df.Time)