Python Pandas Extract text between a word and a symbol

Question

I am trying to extract text between a word and a symbol.

Here is the input table.

And my expected output is like this.

I do not want to have the word ‘Team:’ and ‘<>’ in the output.

I tried something like this, but it keeps ‘Team:’ and ‘<>’ in the output: data['new col'] = data['Team'].str.extract(r'(Team:\s[a-zA-Z\s]+<>)')

Thank you.


Answer 1

Use a regex capture group with the str.extract method, which returns only the text matched inside the parentheses:

df['Team'].str.extract(r'^Team: ([^<>]+)')
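
As a minimal sketch of how this pattern behaves, the snippet below runs it on a small made-up DataFrame. The sample rows and the `Country` column name are assumptions, since the original input table is not shown. Because `[^<>]+` also matches the space before `<>`, the sketch strips whitespace afterwards.

```python
import pandas as pd

# Hypothetical sample data; the original input table is not shown.
df = pd.DataFrame({"Team": ["Team: United States <>", "Team: Brazil <>"]})

# Only the parenthesised group is returned; expand=False yields a Series.
# [^<>]+ also picks up the space before "<>", so strip it afterwards.
df["Country"] = (
    df["Team"].str.extract(r"^Team: ([^<>]+)", expand=False).str.strip()
)

print(df["Country"].tolist())  # ['United States', 'Brazil']
```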

Answer 2

You can do this with a regular expression, which handles country names that contain spaces and names of any length.

import re

row_string = "Team: United States <>"
country_name = re.search(r'Team: (.*) <>', row_string).group(1)
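
Note that re.search returns None when the pattern is not found, so calling .group(1) directly would raise an AttributeError on a row that does not match. One possible way to guard against that is sketched below. The helper name `extract_country` is hypothetical and not part of the answer above.

```python
import re

# Same pattern as above, compiled once so it can be reused for many rows.
TEAM_PATTERN = re.compile(r"Team: (.*) <>")


def extract_country(row_string):
    """Return the text between 'Team: ' and ' <>', or None if there is no match."""
    match = TEAM_PATTERN.search(row_string)
    return match.group(1) if match else None


print(extract_country("Team: United States <>"))  # United States
print(extract_country("no team here"))            # None
```

Assuming every value in the column is a string, a helper like this could be applied row by row with df['Team'].apply(extract_country).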

Answered By: iohans

Answer 3

The reason is that your capture group wraps the whole match, and str.extract returns whatever the capture group contains.

You could write it using the group only around the part that you want to keep:

df['Team'].str.extract(r'Team:\s([a-zA-Z\s]+)<>')

See the capture group values at this regex101 demo.
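
To make the difference concrete, here is a small sketch comparing the two group placements on a single made-up row. It assumes the patterns use the `\s` whitespace class, and the sample value is hypothetical. Since `[a-zA-Z\s]+` also matches the space before `<>`, the inner group keeps a trailing space, which .str.strip() could remove.

```python
import pandas as pd

# Hypothetical sample row; the original input table is not shown.
s = pd.Series(["Team: United States <>"])

# Group around the whole pattern: the prefix and "<>" are returned too.
whole = s.str.extract(r"(Team:\s[a-zA-Z\s]+<>)", expand=False)

# Group around the name only: "Team:" and "<>" are matched but not returned.
inner = s.str.extract(r"Team:\s([a-zA-Z\s]+)<>", expand=False)

print(repr(whole[0]))  # 'Team: United States <>'
print(repr(inner[0]))  # 'United States ' (note the trailing space)
```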
