Is there a way to subtract a year given in another column from a date in Python?

Question:

I ran into a problem today.

This is an example dataset:

import pandas as pd

example = {
    "a": ['1/1/1954 14:14', '2/14/2001 2:00', '2/15/2002 12:00'],
    "b": [1936, 1996, 1960],
}

# load into df:
example = pd.DataFrame(example)

print(example) 

What I was trying to do is:

example['c'] = example['a'] - example['b']

However, I got this error:

unsupported operand type(s) for -: 'str' and 'int'

I tried converting the strings to integers, but that did not work.

Could you please recommend a package or method for this subtraction? I have heard about datetime, but I am not sure how to parse the dates in column "a" with it.

Thank you in advance!

Asked By: Shu

||

Answers:

Convert values to datetimes and extract years:

y = pd.to_datetime(example['a']).dt.year
example['c'] = y - example['b']
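
For reference, here is a minimal self-contained sketch of this approach on the question's sample data. The explicit `format` string is an assumption (it reads the strings as month/day/year hour:minute); it is not part of the original answer, and the name `parsed` is only illustrative.

```python
import pandas as pd

example = pd.DataFrame({
    "a": ["1/1/1954 14:14", "2/14/2001 2:00", "2/15/2002 12:00"],
    "b": [1936, 1996, 1960],
})

# Assumption: column "a" is month/day/year hour:minute.
# Passing format= avoids ambiguity between month-first and day-first dates.
parsed = pd.to_datetime(example["a"], format="%m/%d/%Y %H:%M")

# .dt.year yields the calendar year as an integer, so the subtraction
# is now integer minus integer instead of string minus integer.
example["c"] = parsed.dt.year - example["b"]
print(example)
```

With the sample data, column "c" holds the difference in years between the date in "a" and the year in "b".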

Or extract the four-digit year between the / and the space:

y = example['a'].str.extract(r'/(\d{4})\s+', expand=False).astype(int)
example['c'] = y - example['b']
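
A runnable sketch of the string-based variant on the same sample data, assuming every value in "a" contains a four-digit year followed by whitespace. Rows that do not match would produce NaN and make `astype(int)` fail, so this only suits cleanly formatted input.

```python
import pandas as pd

example = pd.DataFrame({
    "a": ["1/1/1954 14:14", "2/14/2001 2:00", "2/15/2002 12:00"],
    "b": [1936, 1996, 1960],
})

# Capture four digits that follow a "/" and precede whitespace.
# expand=False returns a Series rather than a one-column DataFrame.
year = example["a"].str.extract(r"/(\d{4})\s+", expand=False).astype(int)

example["c"] = year - example["b"]
print(example)
```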
Answered By: jezrael