Remove timezone from timestamp column of pandas DataFrame

Question:

The data is loaded with the timestamp column parsed as pandas datetimes:

df = pd.read_csv('test.csv', parse_dates=['timestamp'])
df


   user         timestamp           speed
0   2   2016-04-01 01:06:26+01:00   9.76
1   2   2016-04-01 01:06:26+01:00   5.27
2   2   2016-04-01 01:06:26+01:00   8.12
3   2   2016-04-01 01:07:53+01:00   8.81

I want to remove the time zone information from the timestamp column, but the following attempt raises an error:


df['timestamp'].tz_convert(None)

TypeError: index is not a valid DatetimeIndex or PeriodIndex
Asked By: Amina Umar

Answers:

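For this solution to work, the column must have a datetime dtype. Series.tz_convert operates on the index, which is why the call in the question fails; use the .dt accessor to act on the values, and assign the result back because tz_localize returns a new Series:

df['timestamp'] = df['timestamp'].dt.tz_localize(None)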
Answered By: Shubham Srivastava
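
As a side note to the answer above, here is a minimal sketch of how tz_localize(None) differs from tz_convert(None) on the .dt accessor. The two sample values are taken from the question; the variable names are only for illustration.

```python
import pandas as pd

# Two tz-aware timestamps with the same +01:00 offset, as in the question.
s = pd.Series(pd.to_datetime(["2016-04-01 01:06:26+01:00",
                              "2016-04-01 01:07:53+01:00"]))

# tz_localize(None) drops the offset and keeps the local wall time.
wall = s.dt.tz_localize(None)

# tz_convert(None) converts to UTC first, then drops the offset.
utc = s.dt.tz_convert(None)

print(wall.iloc[0])  # wall time, 01:06:26
print(utc.iloc[0])   # UTC time, one hour earlier
```

Which of the two is appropriate depends on whether the naive values should represent local time or UTC.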

Given strings in your csv like "2016-04-01 01:06:26+01:00", I can think of the following options:

import pandas as pd

# parse as tz-aware datetimes, then drop the time zone (keeps the local wall time)
# will only work if *all* your timestamps contain "+hh:mm"
df = pd.read_csv('test.csv', parse_dates=['timestamp'])
df['timestamp'] = df.timestamp.dt.tz_localize(None)

print(df.timestamp.dtype)
datetime64[ns]

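# read the column as strings, cut off everything from '+' onward, then parse
# (negative offsets such as "-05:00" are not stripped by this split)
df = pd.read_csv('test.csv')
df['timestamp'] = pd.to_datetime(df.timestamp.str.split('+', expand=True)[0])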

print(df.timestamp.dtype)
datetime64[ns]

# strip the offset while reading the file
# (note: date_parser is deprecated as of pandas 2.0)
df = pd.read_csv('test.csv', parse_dates=['timestamp'],
                 date_parser=lambda x: pd.to_datetime(x.split('+')[0]))

print(df.timestamp.dtype)
datetime64[ns]
Answered By: ouroboros1
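
The string-splitting options above only handle "+hh:mm" offsets. As a hedged variation (not part of the original answer), a regular expression can strip a trailing "+hh:mm" or "-hh:mm" before parsing. The sample values below are made up for illustration, and the pattern assumes the offset always appears at the very end of the string in hh:mm form.

```python
import pandas as pd

# Hypothetical sample: positive offset, negative offset, and no offset at all.
raw = pd.Series(["2016-04-01 01:06:26+01:00",
                 "2016-04-01 01:07:53-05:00",
                 "2016-04-01 01:08:00"])

# Remove a trailing "+hh:mm" or "-hh:mm"; strings without an offset are unchanged.
stripped = raw.str.replace(r"[+-]\d{2}:\d{2}$", "", regex=True)

# All strings now share one format, so they parse to naive datetimes.
naive = pd.to_datetime(stripped)

print(naive.dt.tz)  # None
```

Like the split-based options, this keeps each value's local wall time and discards the offset, so rows that came from different offsets are no longer comparable as instants.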