TypeError: cannot subtract DatetimeArray from ndarray when using timestamp data

Question:

I am trying to calculate the number of days between two columns, where each column stores Timestamp objects and may contain NaN values. When I run the calculation, I get TypeError: cannot subtract DatetimeArray from ndarray. How can I do this calculation when NaN values are present? Ideally, if either value is NaN, the result should be NaN as well.

import numpy as np
import pandas as pd

d1 = {'col1': pd.Timestamp(2017, 1, 1, 12), 'col2': [np.nan]}
x = pd.DataFrame(d1)

# Raises TypeError: cannot subtract DatetimeArray from ndarray
x['col3'] = (x['col2'] - x['col1']).dt.days.astype('int64')

Asked By: sergey_208


Answers:

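Convert both columns to datetime with pd.to_datetime, so that NaN becomes NaT.
Use the nullable 'Int64' dtype instead of 'int64', because 'int64' cannot hold missing values.

np.nan is a float (type(np.nan) prints <class 'float'>), so a column that contains only NaN gets the float64 dtype, which is why it cannot be subtracted from a datetime column. If a float result suits you, cast to float instead of 'Int64'.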

import pandas as pd
import numpy as np

d1 = {'col1': [pd.Timestamp(2017, 1, 1, 12)], 'col2': [np.nan]}
x = pd.DataFrame(d1)

# Convert both columns to datetime64; the NaN in col2 becomes NaT
x['col1'] = pd.to_datetime(x['col1'], errors='raise')
x['col2'] = pd.to_datetime(x['col2'], errors='raise')

# The nullable 'Int64' dtype keeps the missing value as <NA>
x['col3'] = (x['col2'] - x['col1']).dt.days.astype('Int64')

print(x)
print(type(np.nan))
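
As a small sketch of the two result types side by side: the two-row sample data and the column names days_float and days_nullable are assumptions made for illustration and are not part of the question. It assumes a pandas version that supports the nullable 'Int64' dtype.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one complete row and one row with a missing end date.
df = pd.DataFrame({
    "col1": [pd.Timestamp(2017, 1, 1, 12), pd.Timestamp(2017, 1, 1, 12)],
    "col2": [pd.Timestamp(2017, 1, 11, 12), np.nan],
})

# Make sure both columns are datetime64; a NaN entry becomes NaT.
df["col1"] = pd.to_datetime(df["col1"])
df["col2"] = pd.to_datetime(df["col2"])

# Subtracting two datetime columns gives a timedelta column (NaT where data is missing).
delta = df["col2"] - df["col1"]

# Option 1: keep the day count as float, so missing values stay NaN.
df["days_float"] = delta.dt.days.astype("float64")

# Option 2: use the nullable integer dtype, so missing values become <NA>.
df["days_nullable"] = delta.dt.days.astype("Int64")

print(df)
```

The float column is the simpler choice if the values only feed further arithmetic, while 'Int64' keeps whole-number day counts and still marks the missing row.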
Answered By: inquirer