Get the first word after specific word in a column
Question:
I am trying to get the zip code after the specific word ‘zip_code’ within a string.
I have a data frame with a column named "location", in this column there is a string, I want to identify the word "zip_code" and get the value after this word for each row.
Input
name location
Bar1 LA Jefferson zip_code 202378 Avenue free
Pizza Avenue 45 zip_code 45623 wichita st
Tacos Las Americas avenue 7 zip_code 67890 nicolas st
Expected output
name location
Bar1 202378
Pizza 45623
Tacos 67890
So far, following an example I was able to extract the zip code for any string
str = "address1 355 Turnpike Ste 4 address3 zip_code 02021 country US "
str.split("zip_code")[1].split()[0]
>> 02021
But I do not know how to do the same for each row of my column location
Answers:
The best way is to use extract()
which accepts regex and allows searching through each row.
import pandas as pd
import numpy as np
df = pd.DataFrame({'name':['Bar1', 'Pizza', 'Tacos'],
'location':['LA Jefferson zip_code 202378 Avenue free', 'Avenue 45 zip_code 45623 wichita st', 'Las Americas avenue 7 zip_code 67890 nicolas st']})
df['location'] = df['location'].str.extract('zip_codes(.*?)s')
>>> df
name location
0 Bar1 202378
1 Pizza 45623
2 Tacos 67890
I am trying to get the zip code after the specific word ‘zip_code’ within a string.
I have a data frame with a column named "location", in this column there is a string, I want to identify the word "zip_code" and get the value after this word for each row.
Input
name location
Bar1 LA Jefferson zip_code 202378 Avenue free
Pizza Avenue 45 zip_code 45623 wichita st
Tacos Las Americas avenue 7 zip_code 67890 nicolas st
Expected output
name location
Bar1 202378
Pizza 45623
Tacos 67890
So far, following an example I was able to extract the zip code for any string
str = "address1 355 Turnpike Ste 4 address3 zip_code 02021 country US "
str.split("zip_code")[1].split()[0]
>> 02021
But I do not know how to do the same for each row of my column location
The best way is to use extract()
which accepts regex and allows searching through each row.
import pandas as pd
import numpy as np
df = pd.DataFrame({'name':['Bar1', 'Pizza', 'Tacos'],
'location':['LA Jefferson zip_code 202378 Avenue free', 'Avenue 45 zip_code 45623 wichita st', 'Las Americas avenue 7 zip_code 67890 nicolas st']})
df['location'] = df['location'].str.extract('zip_codes(.*?)s')
>>> df
name location
0 Bar1 202378
1 Pizza 45623
2 Tacos 67890