Extract the words between a number and a fixed phrase
Question:
I have a string that contains both numbers and words, and I need to extract a specific name from it.
Below is the string:
2014). Research in fields like human–computer interactio n1Andrew Burton-Jones was the accepting senior editor for this paper
I need to extract the text between "1" and "was the accepting senior editor for this paper", so the output should be "Andrew Burton-Jones".
The string can come in many formats, but the name always sits between a number and the phrase "was the accepting senior editor for this paper".
Below is the code that I tried:
import re
test_string = "2014). Research in fields like human–computer interactio n1Andrew Burton-Jones was the accepting senior editor for this paper"
# Define the regular expression pattern
pattern = r"\d+(.*?)(?= was the accepting senior editor for this paper)"
# Search for the pattern in the input string
match1 = re.search(pattern, test_string)
# If the pattern is found, extract the author name
if match1:
    author_name = match1.group(1)
    print(author_name.strip())
But the above code does not give the desired output. Can anyone help me out?
Answers:
You can use this regular expression; group 1 is exactly what you need.
\d(\D*?) was the accepting senior editor for this paper
It is also worth using an online tool such as https://regex101.com/ to debug your regular expressions.
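For reference, here is a minimal sketch of how that pattern could be applied in Python, written with its backslashes as `\d(\D*?)`. The names `PHRASE`, `PATTERN` and `extract_editor` are made up for illustration, and the sketch assumes the name itself never contains a digit.

```python
import re

# Fixed phrase that always follows the name (taken from the question).
PHRASE = " was the accepting senior editor for this paper"

# One digit, then a lazy run of non-digit characters (the name), then the phrase.
# Because \D cannot match a digit, the capture can only start right after the
# last digit that precedes the phrase.
PATTERN = re.compile(r"\d(\D*?)" + re.escape(PHRASE))


def extract_editor(text):
    """Return the text between the last digit and PHRASE, or None if absent."""
    match = PATTERN.search(text)
    return match.group(1).strip() if match else None


test_string = (
    "2014). Research in fields like human–computer interactio n1Andrew "
    "Burton-Jones was the accepting senior editor for this paper"
)
print(extract_editor(test_string))  # Andrew Burton-Jones
```

The attempt in the question returns too much because `re.search` starts matching at the leftmost digits ("2014"), and `.*?` is free to run across the later "1". Using `\D` instead of `.` prevents the capture from crossing a digit.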