Split a mixed date/time column into two columns and change the date format
Question:
I have data that contains dates and times in a single column. The formats are mixed: some rows of the column hold a date and other rows hold a time. I have created a simple example to illustrate the problem. The following is the sample data frame:
import pandas as pd
data = pd.DataFrame()
data['Date'] = ['Saturday 20th April 2019', '12:30:00', '12:30:00', '15:00:00']
data['Name'] = ['A', 'B', 'C', 'D']
I want to do two things.
- I want to separate date and time into two different columns.
- I want to change the format of date to 20-04-2019.
In the expected output, Date1 and Time are the new columns that I want to create.
How can I do that?
Answers:
Use:
data['Date1'] = data['Date'].str.split(n=1).str[1].ffill()
data['Time1'] = data['Date'].str.extract(r'(\d+:\d+:\d+)', expand=False).bfill()
print(data)
                       Date Name            Date1     Time1
0  Saturday 20th April 2019    A  20th April 2019  12:30:00
1                  12:30:00    B  20th April 2019  12:30:00
2                  12:30:00    C  20th April 2019  12:30:00
3                  15:00:00    D  20th April 2019  15:00:00
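Note that Date1 above is still the string "20th April 2019", not the 20-04-2019 form requested in the question. The following is a minimal sketch of one way to finish the conversion. It assumes every date has an English ordinal suffix (st, nd, rd, th) directly after the day number, a full English month name, and an English locale for %B; the `cleaned` variable is only illustrative.

```python
import pandas as pd

data = pd.DataFrame({
    'Date': ['Saturday 20th April 2019', '12:30:00', '12:30:00', '15:00:00'],
    'Name': ['A', 'B', 'C', 'D'],
})

# Same step as in the answer: drop the weekday, forward-fill the date text
data['Date1'] = data['Date'].str.split(n=1).str[1].ffill()

# Remove the ordinal suffix that follows a digit ("20th" -> "20")
cleaned = data['Date1'].str.replace(r'(?<=\d)(st|nd|rd|th)', '', regex=True)

# Parse with an explicit format, then render as DD-MM-YYYY strings
data['Date1'] = pd.to_datetime(cleaned, format='%d %B %Y').dt.strftime('%d-%m-%Y')
print(data['Date1'].tolist())
```

The result is a column of strings, so it is suitable for display or export; keep the datetime version if further date arithmetic is needed.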
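One way:
# Parse every row; time-only rows pick up a placeholder date
data['Date1'] = pd.to_datetime(data.Date)
data['Time'] = data['Date1'].dt.time
# True for the rows that hold a time rather than a date
s = data.Date.str.contains(':')
# Blank out the placeholder values, then fill from the neighbouring rows
data['Date1'] = data['Date1'].mask(s).ffill()
data['Time'] = data['Time'].where(s).bfill()
data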
Out[1002]:
                       Date Name      Date1      Time
0  Saturday 20th April 2019    A 2019-04-20  12:30:00
1                  12:30:00    B 2019-04-20  12:30:00
2                  12:30:00    C 2019-04-20  12:30:00
3                  15:00:00    D 2019-04-20  15:00:00
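Date1 here is a datetime column, which is why it displays as 2019-04-20. To get the 20-04-2019 form asked for in the question, one option is to format it as text with dt.strftime. This is a small sketch on a stand-in frame (the `out` name and its contents are made up for illustration), assuming a string column is acceptable as the final result.

```python
import pandas as pd

# Stand-in for the Date1 column produced above
out = pd.DataFrame({'Date1': pd.to_datetime(['2019-04-20'] * 4)})

# Render each timestamp as a DD-MM-YYYY string
out['Date1'] = out['Date1'].dt.strftime('%d-%m-%Y')
print(out['Date1'].tolist())
```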