If any strings in a list match regex
Question:
I need to check if any of the strings in a list match a regex. If any do, I want to continue. The way I’ve always done it in the past is with a list comprehension, something like:
r = re.compile('.*search.*')
if [line for line in output if r.match(line)]:
    do_stuff()
Which I now realize is pretty inefficient. If the very first item in the list matches, we can skip all the rest of the comparisons and move on. I could improve this with:
r = re.compile('.*search.*')
for line in output:
    if r.match(line):
        do_stuff()
        break
But I’m wondering if there’s a more pythonic way to do this.
Answers:
You can use the builtin any():
r = re.compile('.*search.*')
if any(r.match(line) for line in output):
    do_stuff()
Passing in the lazy generator to any() will allow it to exit on the first match without having to check any farther into the iterable.
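To see the early exit in action, here is a small sketch. The sample list and the tracked() helper are made up for illustration; they just record which lines the generator actually hands to any().

```python
import re

r = re.compile('.*search.*')
output = ['searched', 'hello', 'world']  # hypothetical sample data

checked = []  # records every line that any() actually pulled

def tracked(lines):
    # Hypothetical helper: yields lines one by one and notes each one.
    for line in lines:
        checked.append(line)
        yield line

found = any(r.match(line) for line in tracked(output))
print(found, checked)
```

Because the first item already matches, only that one line is ever pulled from the generator. With the list comprehension version, every line would have been tested first.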
Given that I am not allowed to comment yet, I wanted to provide a small correction to MrAlexBailey’s answer, and also answer nat5142’s question. Correct form would be:
r = re.compile('.*search.*')
if any(r.match(line) for line in output):
    do_stuff()
If you want to collect the strings that matched, you would do:
lines_to_log = [line for line in output if r.match(line)]
In addition, if you want to find all lines that match any compiled regular expression in a list of compiled regular expressions r=[r1,r2,…,rn], you can use:
lines_to_log = [line for line in output if any(reg_ex.match(line) for reg_ex in r)]
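A quick sketch of the multi-pattern version with concrete data. The sample lines and the name patterns (standing in for the list r above) are assumptions for illustration only.

```python
import re

# Hypothetical sample data and patterns, not from the original question.
output = ['error: disk full', 'ok', 'warning: low memory', 'searching...']
patterns = [re.compile(r'error'), re.compile(r'warning')]

# Keep a line if at least one pattern matches at its start.
# The inner any() stops at the first pattern that matches.
lines_to_log = [
    line for line in output
    if any(p.match(line) for p in patterns)
]
print(lines_to_log)
```

Note that match() only matches at the beginning of the string, so a pattern like 'error' would not catch a line such as 'fatal error'. Use search() instead if the text can appear anywhere in the line.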
Starting with Python 3.8 and the introduction of assignment expressions (PEP 572, the := operator), we can also capture a witness of an any expression when a match is found and use it directly:
import re

pattern = re.compile('.*search.*')
items = ['hello', 'searched', 'world', 'still', 'searching']
if any((match := pattern.match(x)) for x in items):
    print(match.group(0))
# prints: searched
For each item, this:
- Applies the regex match (pattern.match(x))
- Assigns the result to a match variable (either None or a re.Match object)
- Applies the truth value of match as part of the any expression (None -> False, Match -> True)
- If match is None, then the any search loop continues
- If match is a re.Match object, then we exit the any expression, which is considered True, and the match variable can be used within the condition’s body
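One thing the steps above imply: outside the if body, match is only reliable if something was actually iterated. A minimal sketch of wrapping this in a function, assuming Python 3.8+; first_match is a hypothetical helper name, and pre-binding match to None is my own precaution for the empty-list case.

```python
import re

def first_match(pattern, items):
    # Hypothetical helper: returns the first re.Match object, or None.
    # Pre-bind the name so an empty iterable does not leave it unassigned.
    match = None
    if any((match := pattern.match(x)) for x in items):
        return match
    return None

pattern = re.compile('.*search.*')
items = ['hello', 'searched', 'world', 'still', 'searching']

hit = first_match(pattern, items)
print(hit.group(0) if hit else 'no match')
```

When nothing matches, the walrus has assigned None on the last item, and when the list is empty the pre-bound None is what remains, so the function returns None in both cases.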
In reply to a question asked by @nat5142, in the answer given by @MrAlexBailey:
“Any way to access the matched string using this method? I’d like to print it for logging purposes”, assuming “this” refers to:
if any(r.match(line) for line in output):
    do_stuff()
You can do a for loop over the list of matching lines:
# r = re.compile('.*search.*')
for matched_line in [line for line in output if r.match(line)]:
    do_stuff(matched_line)  # <- using the matched line here
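Another approach is mapping each matching line with the map function. In Python 3, map is lazy, so it has to be consumed (here with list()) for the logging to actually run:
# r = re.compile('.*search.*')
# log = lambda x: print(x)
list(map(log, [line for line in output if r.match(line)]))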
Although this does not involve the “any” function and might not even be close to what you desire…
I thought this answer was not very relevant so here’s my second attempt…
I suppose you could do this:
# def log_match(match):
#     if match: print(match)
#     return match
if any(log_match(r.match(line)) for line in output):
    do_stuff()
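Here is a self-contained sketch of that idea, with made-up sample data. Instead of printing, the hypothetical log_match below appends to a list so the effect is easy to inspect.

```python
import re

r = re.compile('.*search.*')
output = ['hello', 'searched', 'searching']  # hypothetical sample data
logged = []

def log_match(match):
    # Record the matched text when there is a match, then pass the
    # result through unchanged so any() can still test its truthiness.
    if match:
        logged.append(match.group(0))
    return match

if any(log_match(r.match(line)) for line in output):
    print('matched:', logged)
```

Keep in mind that any() stops at the first match, so only the first matching line gets logged this way; 'searching' is never tested.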