Regular Expression with Two Names: One With Middle Initial and One Without
Question:
I’m attempting to identify the names in this string, using regex (https://regex101.com).
Example text:
Elon R. Musk (245)436-7956 Jeff Bezos (235)231-3432
What I’ve tried so far only seems to work for names without a middle initial:
([A-Z]{1}[a-z]+) ([A-Z]{1}[a-z]+)
Note: Phone Numbers are random keystrokes. Please don’t try calling them.
Here’s an example of Python code using the re package:
import re
strr = 'Elon R. Musk (245)436-7956 Jeff Bezos (235)231-3432'
def gimmethenamesdammit(strr):
    regex = re.compile("([A-Z]{1}[a-z]+) ([A-Z]{1}[a-z]+)")
    print(regex.findall(strr))
gimmethenamesdammit(strr)
To sum things up, please modify the regular expression above to highlight both the names Elon R. Musk and Jeff Bezos.
Desired Python output when running gimmethenamesdammit(strr):
gimmethenamesdammit(strr)
[('Elon', 'R.', 'Musk'), ('Jeff', 'Bezos')]
Answers:
Try this: \b([^\s*][a-zA-Z_.\s]+)\b
Demo: https://regex101.com/r/7ul1pQ/1
\b...\b: word boundary
[^\s*][a-zA-Z_.\s]+: text with letters, dots and spaces
(): captured group
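To check this pattern against the example text, here is a minimal sketch using Python's re module. The variable names are illustrative only, and the pattern is written as a raw string with its escape sequences (\b, \s) spelled out.

```python
import re

text = 'Elon R. Musk (245)436-7956 Jeff Bezos (235)231-3432'

# \b anchors the match at word boundaries; the character class allows
# letters, underscores, dots and whitespace, so digits and parentheses
# end a match.
name_pattern = re.compile(r"\b([^\s*][a-zA-Z_.\s]+)\b")

full_names = name_pattern.findall(text)
print(full_names)  # ['Elon R. Musk', 'Jeff Bezos']
```

Note that this returns each name as a single string rather than a tuple of its parts, so it differs from the output format requested in the question.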
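The following regular expression finds both names, although as written it stops at the middle initial for the first one ('Elon R' rather than 'Elon R. Musk'):
import re
strr = 'Elon R. Musk (245)436-7956 Jeff Bezos (235)231-3432'
# A capitalized word, one whitespace character, then a second word
regex = r"[A-Z]\w+\s[A-Z]?\w+"
POCs = re.findall(regex, strr)  # ['Elon R', 'Jeff Bezos']
print(f"{POCs[0]}, {POCs[-1]}")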
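For the tuple-per-name output requested in the question, one possible approach (a sketch, not taken from either answer) is to make the middle initial an optional capture group and drop that group when it does not match. The function name extract_names and the pattern itself are assumptions for illustration; the pattern only handles a single-letter initial followed by a period.

```python
import re

text = 'Elon R. Musk (245)436-7956 Jeff Bezos (235)231-3432'

# First name, an optional middle initial such as "R.", then last name.
NAME_PATTERN = re.compile(r"([A-Z][a-z]+) (?:([A-Z]\.) )?([A-Z][a-z]+)")

def extract_names(s):
    """Return one tuple per name, omitting the middle initial when absent."""
    names = []
    for match in NAME_PATTERN.finditer(s):
        # Optional groups that did not participate in the match are None.
        names.append(tuple(part for part in match.groups() if part is not None))
    return names

print(extract_names(text))  # [('Elon', 'R.', 'Musk'), ('Jeff', 'Bezos')]
```

re.finditer is used here instead of re.findall because findall reports an unmatched group as an empty string, which would give ('Jeff', '', 'Bezos') for the second name.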