Troubleshooting a pd.to_timedelta calculation failure
Question:
I have previously used the following conditional statement successfully:
for RxDat in df2:
    condition = (df['Tdate'] > RxDat - pd.to_timedelta(46, unit="D")) & (df['Tdate'] < RxDat)
Now I am getting the following error:
TypeError: unsupported operand type(s) for -: 'str' and 'Timedelta'
I have extracted the following data to illustrate the error.
df['Tdate'] contains:
[Timestamp('2004-08-25 00:00:00'), Timestamp('2004-10-13 00:00:00'), Timestamp('2004-12-13 00:00:00'), Timestamp('2005-02-21 00:00:00'), Timestamp('2005-04-28 00:00:00'), Timestamp('2005-08-24 00:00:00')]
df2['RxDate'] contains:
[Timestamp('2004-08-20 00:00:00'), Timestamp('2004-08-23 00:00:00'), Timestamp('2004-08-18 00:00:00'), Timestamp('2004-08-15 00:00:00'), Timestamp('2004-08-12 00:00:00'), Timestamp('2004-08-13 00:00:00')]
I have tried looking at this a few ways and cannot see why I get the error.
Answers:
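If you loop over df2 itself, then RxDat takes the column names (strings), not the dates in the column:
for RxDat in df2:
That is why the subtraction fails with 'str' and 'Timedelta'. Loop over the column values instead:
for RxDat in df2['RxDate']: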
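To see the difference between the two loops, here is a minimal sketch. The frames are rebuilt from the first three values shown in the question; the names `labels`, `window` and `conditions` are only for illustration and are not part of the original code.

```python
import pandas as pd

# Small frames rebuilt from the first three values in the question.
df = pd.DataFrame({"Tdate": pd.to_datetime(["2004-08-25", "2004-10-13", "2004-12-13"])})
df2 = pd.DataFrame({"RxDate": pd.to_datetime(["2004-08-20", "2004-08-23", "2004-08-18"])})

# Iterating over a DataFrame yields its column labels, which are strings here.
labels = [RxDat for RxDat in df2]

window = pd.to_timedelta(46, unit="D")

# Iterating over the column yields Timestamp values, so the subtraction works.
conditions = []
for RxDat in df2["RxDate"]:
    condition = (df["Tdate"] > RxDat - window) & (df["Tdate"] < RxDat)
    conditions.append(condition)
```

Each entry of `conditions` is a boolean Series aligned with `df`, one per RxDate value.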
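Non-loop solution with broadcasting. The output is a 2D NumPy boolean array in which condition[i, j] is True when Tdate[i] falls within the 46 days before RxDate[j]: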
a = df['Tdate'].to_numpy()[:, None]
b = df2['RxDate'].sub(pd.to_timedelta(46, unit="D")).to_numpy()
c = df2['RxDate'].to_numpy()
condition = (a > b) & (a < c)
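A small runnable sketch of the broadcasting approach follows. The dates are made up for illustration (they are not the question's data), chosen so that some comparisons are True, and the `has_match` reduction is one possible way to use the result, not something the original code does.

```python
import pandas as pd

# Hypothetical sample data, chosen so that some rows match.
df = pd.DataFrame({"Tdate": pd.to_datetime(["2004-08-25", "2004-10-13"])})
df2 = pd.DataFrame({"RxDate": pd.to_datetime(["2004-09-01", "2004-10-20", "2004-08-20"])})

# Column vector of shape (2, 1) so it broadcasts against the (3,) arrays below.
a = df["Tdate"].to_numpy()[:, None]
b = df2["RxDate"].sub(pd.to_timedelta(46, unit="D")).to_numpy()
c = df2["RxDate"].to_numpy()

# Shape (2, 3): one row per Tdate, one column per RxDate.
condition = (a > b) & (a < c)

# True for each Tdate that lies in the 46-day window before at least one RxDate.
has_match = condition.any(axis=1)
```

Reducing with `any(axis=1)` collapses the RxDate dimension, so `has_match` can be used directly as a row mask on `df`.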