Breaking a Python string in a pandas DataFrame

Question:

I have a column ‘released’ which has values like ‘June 13, 1980 (United States)’.

I want to get the year from this string, so I tried the following code:

df['year_correct'] = df['released'].astype(str).str[',':'(']

But it returns NaN for every value in the new ‘year_correct’ column. Please help.

Asked By: Zohaib Ahmed


Answers:

A better way might be to extract the 4-digit value with a regular expression, using word boundaries (\b) so that only a standalone run of exactly 4 digits matches:

df['year_correct'] = df['released'].astype(str).str.extract(r'\b(\d{4})\b')

Example:

                        released year_correct
0  June 13, 1980 (United States)         1980
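
For anyone who wants to try this end to end, here is a minimal self-contained sketch. The sample frame is hypothetical: only the first value comes from the question, and the other two rows are made up to show a string without a comma and a missing value. Passing expand=False is an optional addition so that the result is a Series rather than a one-column DataFrame.

```python
import pandas as pd

# Hypothetical sample data; only the first value appears in the question.
df = pd.DataFrame(
    {"released": ["June 13, 1980 (United States)", "1980 (Japan)", None]}
)

# \b is a word boundary and \d{4} matches exactly four digits, so only a
# standalone four-digit number is captured. Rows with no match become NaN.
# expand=False returns a Series instead of a one-column DataFrame.
df["year_correct"] = (
    df["released"].astype(str).str.extract(r"\b(\d{4})\b", expand=False)
)

print(df)
```

Note that the extracted years are strings, not integers, so they would need a separate conversion step before doing arithmetic on them.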
Answered By: mozway