Python find substring between two markers

Question:

Please note I have read other answers on here, but they haven't worked for me (or I have applied them incorrectly; sorry if I have).

I have a list which I converted to a DataFrame. I then converted the URL column to string dtype using:

df['URL'] = pd.Series(df['URL'], dtype="string")

However, when I try to use .find or .partition, I get this error:

df['URL'].find('entry/')

AttributeError: 'Series' object has no attribute 'find'

Each string looks like the one below, and I need to get the unique number between 'entry/' and '/event'. How can I do this?

https://fantasy.premierleague.com/entry/349289/event/14
Asked By: mgd6


Answers:

You have to use Series.str to access the values of the Series as strings, so that you can apply string methods (like .find or .partition) to each element.
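As a minimal sketch of the .str accessor, assuming a small made-up DataFrame modeled on the URL in the question (the second URL and the variable names are illustrative only):

```python
import pandas as pd

# Hypothetical sample data; only the first URL comes from the question.
df = pd.DataFrame({
    "URL": [
        "https://fantasy.premierleague.com/entry/349289/event/14",
        "https://fantasy.premierleague.com/entry/12345/event/2",
    ]
})
df["URL"] = df["URL"].astype("string")

# .str applies the string method to every element of the Series.
positions = df["URL"].str.find("entry/")

# .str.partition splits each value around the first occurrence of the
# separator and returns a DataFrame with columns 0 (before), 1 (separator)
# and 2 (after).
after_entry = df["URL"].str.partition("entry/")[2]
entry_id = after_entry.str.partition("/event")[0]

print(entry_id.tolist())
```

Chaining two partitions works here because every URL contains both markers; a value missing either marker would not give a meaningful result, so this is only suitable for uniformly shaped data.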

But a better approach in this case is to use str.extract, which pulls out the capture group in the regex entry/(\d+)/event. With expand=False and a single group, the result is a Series rather than a DataFrame.

df['URL'].str.extract(r"entry/(\d+)/event", expand=False)
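A slightly fuller sketch of the extract approach, with the digit class written as \d in a raw string. The sample rows and the entry_id column name are assumptions for illustration; the second row is there to show what happens when the pattern does not match.

```python
import pandas as pd

# Hypothetical sample data; the second URL deliberately has no entry id.
df = pd.DataFrame({
    "URL": [
        "https://fantasy.premierleague.com/entry/349289/event/14",
        "https://fantasy.premierleague.com/",
    ]
})
df["URL"] = df["URL"].astype("string")

# The raw string keeps the backslash in \d intact. Rows where the pattern
# does not match get a missing value instead of raising an error.
df["entry_id"] = df["URL"].str.extract(r"entry/(\d+)/event", expand=False)

print(df["entry_id"])
```

The extracted values are still strings, so a numeric conversion would be a separate step if numbers are needed.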
Answered By: Abdul Niyas P M

If you just had a plain string (the URL) then you could isolate the value using a regular expression like this:

import re

url = 'https://fantasy.premierleague.com/entry/349289/event/14'

if (g := re.search(r'(?<=entry/)(\d+?)(?=/event)', url)):
    print(g.group(1))
else:
    print('Not found')

Output:

349289
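
For comparison, since the question mentions .partition, the same value can also be isolated from a plain string without a regular expression. This is only a sketch; the helper name between is hypothetical and not part of any library.

```python
def between(text, start, end):
    """Return the text between the first `start` and the next `end`, or None."""
    # partition returns (before, separator, after); the separator slot is
    # empty when the marker was not found.
    _, found_start, rest = text.partition(start)
    if not found_start:
        return None
    middle, found_end, _ = rest.partition(end)
    return middle if found_end else None


url = 'https://fantasy.premierleague.com/entry/349289/event/14'
print(between(url, 'entry/', '/event'))  # prints 349289
```

Unlike the regex, this does not check that the captured text consists only of digits.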
Answered By: OldBill