Breaking a Python string in a pandas DataFrame
Question:
I have a column ‘released’ which has values like ‘June 13, 1980 (United States)’.
I want to get the year from this string, so I tried the following code:
df['year_correct'] = df['released'].astype(str).str[',':'(']
But it returns NaN for every value in the new ‘year_correct’ column. Please help.
Answers:
A better way might be to extract the four-digit value with a regular expression, using word boundaries (\b) so that only runs of exactly four digits match:
df['year_correct'] = df['released'].astype(str).str.extract(r'\b(\d{4})\b')
Example:
                        released year_correct
0  June 13, 1980 (United States)         1980
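For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the same idea. The sample rows and the extra 'year_int' column are assumptions added for illustration and are not part of the original question. Passing expand=False makes str.extract return a Series instead of a one-column DataFrame, which is a slightly tidier fit for a column assignment.

```python
import pandas as pd

# Hypothetical sample data shaped like the 'released' column in the question.
df = pd.DataFrame({
    "released": [
        "June 13, 1980 (United States)",
        "July 2, 1980 (United States)",
        None,  # a missing value, to show how it is handled
    ]
})

# \b(\d{4})\b captures a run of exactly four digits bounded by non-word characters.
# expand=False returns a Series rather than a one-column DataFrame.
df["year_correct"] = df["released"].str.extract(r"\b(\d{4})\b", expand=False)

# Optional (assumption: you want numbers, not strings): convert to a nullable
# integer dtype so rows without a year stay missing instead of raising an error.
df["year_int"] = pd.to_numeric(df["year_correct"]).astype("Int64")

print(df)
```

The extracted values are strings, so the numeric conversion is only needed if the year will be used for sorting or arithmetic.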