Pandas: remove all words before a specific word and get the first n words after it

Question:

I have a dataframe like this:

import pandas as pd
df = pd.DataFrame({'caption': ['hello this pack is for you: Jake Peralta. Thanks']})  # values must be list-like; a bare scalar raises ValueError
df

caption
hello this pack is for you: Jake Peralta. Thanks
...
...
...

I’m trying to get the recipient’s first and last name. The format of the caption column is always the same, so I want to delete everything before "for you:" and keep the first 2 words after it (this number may change).

Asked By: Clegane


Answers:

Here is one way:

df.caption.apply(lambda st: st[st.find(":")+2:st.find(".")])

Output:

0     Jake Peralta
Name: caption, dtype: object
Answered By: eshirvana

Maybe you can try it like this:

df['caption'].str.split("for you: ").str[1].str.split('.').str[0]

Output:

0    Jake Peralta
1      first last
Answered By: Deepan

This takes care of leading spaces in the name:

>>> df.caption.str.split(".").str[0].str.split(":").str[1].str.strip()

1    Jake Peralta
Name: caption, dtype: object
Answered By: the_ordinary_guy
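
The answers above all stop at the first period, so they do not cover the "this number may change" part of the question. A minimal sketch of one way to take the first n words after the marker follows. The helper name first_n_words_after and the second sample row are hypothetical, added only for illustration, and the code assumes words are separated by whitespace and that only trailing punctuation on the last kept word should be removed.

```python
import pandas as pd

# Sample data; the second row is a made-up example with extra leading spaces.
df = pd.DataFrame({
    "caption": [
        "hello this pack is for you: Jake Peralta. Thanks",
        "hello this pack is for you:   Amy Santiago. Thanks",
    ]
})


def first_n_words_after(captions: pd.Series, marker: str, n: int) -> pd.Series:
    """Return the first n whitespace-separated words that follow marker.

    Hypothetical helper: rows that do not contain the marker give an empty string.
    """
    # partition() splits on the first literal occurrence of the marker;
    # column 2 holds everything after it.
    tail = captions.str.partition(marker)[2]
    # split() with no pattern splits on any run of whitespace,
    # which also discards leading spaces before the name.
    words = tail.str.split().str[:n]
    # Rejoin the kept words and drop punctuation left on the end of the last word.
    return words.str.join(" ").str.rstrip(".,;:!?")


names = first_n_words_after(df["caption"], "for you:", 2)
print(names)
```

Changing the last argument changes how many words are kept, for example first_n_words_after(df["caption"], "for you:", 1) keeps only the first name. Punctuation inside the kept span (not at its end) is left untouched in this sketch.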