AttributeError: 'DataFrame' object has no attribute 'to_datetime'

Question:

I want to convert all the items in the ‘Time’ column of my pandas DataFrame from UTC to Eastern time. However, when I follow the answer in this Stack Overflow post, some of the keywords are not recognized in pandas 0.20.3. How should I do this?

tweets_df = pd.read_csv('valid_tweets.csv')

tweets_df['Time'] = tweets_df.to_datetime(tweets_df['Time'])
tweets_df.set_index('Time', drop=False, inplace=True)

The error is:

tweets_df['Time'] = tweets_df.to_datetime(tweets_df['Time'])
  File "/scratch/sjn/anaconda/lib/python3.6/site-packages/pandas/core/generic.py", line 3081, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'to_datetime'

Items in the Time column look like this:

2016-10-20 03:43:11+00:00

Update: Using

tweets_df['Time'] = pd.to_datetime(tweets_df['Time'])
tweets_df.set_index('Time', drop=False, inplace=True)
tweets_df.index = tweets_df.index.tz_localize('UTC').tz_convert('US/Eastern') 

did not convert the times. Any idea what needs to be fixed?

Update 2:
The following code does not convert the values in place: when I print row['Time'] using iterrows(), it still shows the original values. Do you know how to do the conversion in place?

tweets_df['Time'] = pd.to_datetime(tweets_df['Time'])
for index, row in tweets_df.iterrows():
    row['Time'].tz_localize('UTC').tz_convert('US/Eastern')
for index, row in tweets_df.iterrows():
    print(row['Time'])
Asked By: Mona Jalal


Answers:

to_datetime is a function defined in the pandas module, not a method on a DataFrame. Try:

tweets_df['Time'] = pd.to_datetime(tweets_df['Time'])
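For the UTC-to-Eastern part of the question, here is a minimal sketch, assuming the column holds strings like the sample in the question. The two-row frame is made up for illustration. The conversion methods return new objects rather than modifying anything in place, so the result has to be assigned back to the column.

```python
import pandas as pd

# Hypothetical stand-in for the CSV data; values follow the sample format.
tweets_df = pd.DataFrame(
    {"Time": ["2016-10-20 03:43:11+00:00", "2016-12-01 15:00:00+00:00"]}
)

# utc=True yields timezone-aware UTC timestamps.
tweets_df["Time"] = pd.to_datetime(tweets_df["Time"], utc=True)

# tz_convert returns a new Series, so assign it back to the column.
tweets_df["Time"] = tweets_df["Time"].dt.tz_convert("US/Eastern")

print(tweets_df["Time"])
```

Converting only the index, or calling tz_convert on each row inside iterrows() without storing the result, leaves the 'Time' column unchanged.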
Answered By: Alex

to_datetime is a general function that doesn’t have an equivalent DataFrame method. That said, you can call it using apply on a single-column DataFrame.

tweets_df['Time'] = tweets_df[['Time']].apply(pd.to_datetime)

apply is especially useful if multiple columns need to be converted into datetime64.
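As a rough sketch of the multi-column case, assuming a frame with two date-like string columns (the column names 'Created' and 'Updated' and the data are hypothetical):

```python
import pandas as pd

# Hypothetical frame with two string columns that hold timestamps.
df = pd.DataFrame(
    {
        "Created": ["2016-10-20 03:43:11", "2016-10-21 08:00:00"],
        "Updated": ["2016-10-22 10:15:00", "2016-10-23 12:30:00"],
        "Text": ["first tweet", "second tweet"],
    }
)

date_cols = ["Created", "Updated"]

# DataFrame.apply passes each selected column to pd.to_datetime as a Series.
df[date_cols] = df[date_cols].apply(pd.to_datetime)

print(df.dtypes)
```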

It’s also possible to apply it to a single column (a Series), but that’s not advisable: it becomes a loop that calls to_datetime once per value, which is very slow for large frames.

tweets_df['Time'] = tweets_df['Time'].apply(pd.to_datetime)
#                            ^      ^  <--- single brackets

PSA: Passing format= can make the conversion much faster, because pandas no longer has to infer the format. See this post for more info.
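A small sketch of passing format=, assuming every value matches the sample format from the question (the Series below is made up for illustration):

```python
import pandas as pd

# Hypothetical values in the same layout as the question's sample.
times = pd.Series(["2016-10-20 03:43:11+00:00", "2016-10-21 08:00:00+00:00"])

# %z matches the "+00:00" offset; utc=True returns timezone-aware UTC values.
parsed = pd.to_datetime(times, format="%Y-%m-%d %H:%M:%S%z", utc=True)

print(parsed)
```

If some values do not match the given format, to_datetime raises an error by default, so this only suits columns with a consistent layout.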

Answered By: cottontail