Subtract a year from a datetime column in pandas

Question:

I have a datetime column as below –

>>> df['ACC_DATE'].head(2)
538   2006-04-07
550   2006-04-12
Name: ACC_DATE, dtype: datetime64[ns]

Now I want to subtract a year from each value in this column. How can I do this, and which library should I use?

The expected field –

        ACC_DATE    NEW_DATE
538   2006-04-07  2005-04-07
550   2006-04-12  2005-04-12
Asked By: 0nir


Answers:

You could use pd.Timedelta:

df["NEW_DATE"] = df["ACC_DATE"] - pd.Timedelta(days=365) 

Or replace:

df["NEW_DATE"] = df["ACC_DATE"].apply(lambda x: x.replace(year=x.year - 1))

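But neither handles leap years correctly: the fixed 365-day offset drifts by a day whenever the span crosses a Feb 29, and replace raises a ValueError for Feb 29 when the target year is not a leap year. You could use dateutil.relativedelta instead:

from dateutil.relativedelta import relativedelta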

df["NEW_DATE"] = df["ACC_DATE"].apply(lambda x: x - relativedelta(years=1))
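
To make the leap-year difference concrete, here is a minimal sketch comparing the three approaches. The sample frame, including the 2008-02-29 row, is made up for illustration and is not from the question.

```python
import pandas as pd
from dateutil.relativedelta import relativedelta

# Hypothetical sample data: a leap day plus one date from the question.
df = pd.DataFrame({"ACC_DATE": pd.to_datetime(["2008-02-29", "2006-04-07"])})

# Fixed 365-day shift: 2008-02-29 becomes 2007-03-01, not a date in February.
by_timedelta = df["ACC_DATE"] - pd.Timedelta(days=365)

# Calendar-aware shift: 2008-02-29 is clamped to 2007-02-28.
by_relativedelta = df["ACC_DATE"].apply(lambda x: x - relativedelta(years=1))

# replace() raises ValueError because 2007-02-29 does not exist.
try:
    df["ACC_DATE"].apply(lambda x: x.replace(year=x.year - 1))
    replace_failed = False
except ValueError:
    replace_failed = True

print(by_timedelta.iloc[0], by_relativedelta.iloc[0], replace_failed)
```

For dates that do not sit near a Feb 29, such as 2006-04-07, all three give 2005-04-07.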
Answered By: Padraic Cunningham

You can use DateOffset to achieve this:

In[88]:
df['NEW_DATE'] = df['ACC_DATE'] - pd.DateOffset(years=1)
df

Out[88]: 
        ACC_DATE   NEW_DATE
index                      
538   2006-04-07 2005-04-07
550   2006-04-12 2005-04-12
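
As a self-contained sketch of the same idea, the snippet below rebuilds a small frame and adds a leap-day row to show how DateOffset treats Feb 29. The extra row and its index label 600 are assumptions added for illustration.

```python
import pandas as pd

# Hypothetical sample data: the two dates from the question plus a leap day.
df = pd.DataFrame(
    {"ACC_DATE": pd.to_datetime(["2006-04-07", "2006-04-12", "2008-02-29"])},
    index=[538, 550, 600],
)

# Vectorized calendar-year shift on the whole column; no apply() needed.
# The leap day 2008-02-29 is moved to 2007-02-28.
df["NEW_DATE"] = df["ACC_DATE"] - pd.DateOffset(years=1)
print(df)
```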
Answered By: EdChum

Use DateOffset:

df["NEW_DATE"] = df["ACC_DATE"] - pd.offsets.DateOffset(years=1)
print(df)
        ACC_DATE   NEW_DATE
index                      
538   2006-04-07 2005-04-07
550   2006-04-12 2005-04-12
Answered By: jezrael

If you have a single pd.Timestamp object rather than a column:

  1. Using pd.DateOffset(years=n) is not ideal, as it produces:

UserWarning: Discarding nonzero nanoseconds in conversion

  2. pd.Timedelta() doesn't accept years.

The only approach that worked for me in this case is pd.Timestamp.replace:

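n = 1  # number of years to subtract
t = pd.Timestamp.now()
t = t.replace(year=t.year - n)  # raises ValueError on Feb 29 if the target year is not a leap year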

This was hinted at in the answer by Padraic, but it needed further clarification.
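
One caveat with replace is that it fails on Feb 29 when the target year is not a leap year. A possible workaround, sketched here with a hypothetical helper named subtract_years (not part of pandas), is to fall back to Feb 28 in that case:

```python
import pandas as pd


def subtract_years(ts: pd.Timestamp, n: int) -> pd.Timestamp:
    """Hypothetical helper: shift ts back n calendar years, keeping nanoseconds."""
    try:
        return ts.replace(year=ts.year - n)
    except ValueError:
        # Feb 29 has no counterpart in a non-leap target year; use Feb 28.
        return ts.replace(year=ts.year - n, day=28)


leap = pd.Timestamp("2008-02-29 12:00:00.000000001")
print(subtract_years(leap, 1))
```

Clamping to Feb 28 is a design choice that matches what relativedelta and DateOffset do; rolling forward to Mar 1 would be an equally defensible convention.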

Answered By: Asclepius