Why can't I separate my date-time column properly using pandas in Python?
Question:
I have never asked a question here before, but I am in need of help. I am trying to split the date column from my CSV file, which has the form '12/10/2022 11:45:12.446 +0200', into a column for date and a column for time.
I have tried what I found on various sites and by asking ChatGPT, but either I get errors, or it runs but 'NaT' values fill both columns.
This is my current code, which runs but gives me 'NaT' values:
import pandas as pd
data = pd.read_csv('file_directory.csv')
print(data['date'].dtype)
data['date'] = pd.to_datetime(data['date'], errors='coerce', utc=True, format='%m/%d/%Y %H:%M:%S.%f %z')
data['date'] = data['date'].dt.date
data['time'] = data['date'].dt.time
data.drop('date', axis=1, inplace=True)
Can anyone help me fix it and find the cause of the problem?
Thank you!
Answers:
Your approach should work (although it is not optimal if you want strings in the columns). You have to swap two lines of code and avoid dropping the date. Once "date" has been overwritten with .dt.date it holds plain date objects, so .dt.time can no longer be used on it:
import pandas as pd
data = pd.DataFrame({'date': ['12/10/2022 11:45:12.446 +0200']})
data['date'] = pd.to_datetime(data['date'], errors='coerce', utc=True, format='%m/%d/%Y %H:%M:%S.%f %z')
# extract the time first, while "date" is still a datetime column
data['time'] = data['date'].dt.time
# only then overwrite "date" with the date part
data['date'] = data['date'].dt.date
Output:
         date             time
0  2022-12-10  09:45:12.446000
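Note that the time above reads 09:45 rather than 11:45 because utc=True converts the timestamp to UTC. If you would rather keep the time as written in the file, one option is to parse without utc=True. This is only a minimal sketch: it assumes every row carries the same offset (with mixed offsets the result depends on your pandas version), and the variable name `local` is just for illustration.

```python
import datetime
import pandas as pd

data = pd.DataFrame({'date': ['12/10/2022 11:45:12.446 +0200']})

# Without utc=True the parsed values keep their +02:00 offset
# (assumption: all rows share the same offset).
local = pd.to_datetime(data['date'], format='%m/%d/%Y %H:%M:%S.%f %z')

# .dt.time and .dt.date are taken in the timestamp's own offset,
# so the time stays 11:45:12.446 instead of being shifted to UTC.
data['time'] = local.dt.time
data['date'] = local.dt.date
print(data)
```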
If you want strings (the "time" column then keeps the "+0200" offset):
data = pd.DataFrame({'date': ['12/10/2022 11:45:12.446 +0200']})
data[['date', 'time']] = data['date'].str.split(r' +', n=1, expand=True)
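As for the NaT values: with errors='coerce', any value that does not match the format is silently turned into NaT, so it may help to look at which raw strings fail to parse. A rough sketch, using made-up sample rows (the second is day-first, the third lacks the fractional seconds and offset); your file may fail for a different reason, and the names `parsed` and `bad_rows` are only illustrative.

```python
import pandas as pd

# Hypothetical sample data, not taken from the original file
data = pd.DataFrame({'date': [
    '12/10/2022 11:45:12.446 +0200',   # matches the format
    '25/10/2022 11:45:12.446 +0200',   # day-first: 25 is not a valid month
    '12/10/2022 11:45:12',             # no fractional seconds, no offset
]})

parsed = pd.to_datetime(data['date'], errors='coerce', utc=True,
                        format='%m/%d/%Y %H:%M:%S.%f %z')

# Rows that had a value in the file but could not be parsed
bad_rows = data.loc[parsed.isna() & data['date'].notna(), 'date']
print(bad_rows)
```

If the failing rows turn out to be day-first dates, changing the format to start with '%d/%m/%Y' would be the thing to try.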