Trying to understand the difference in what matches and the resulting output for findall vs finditer

Question:

  1. Using findall:
import re

target_string = "please sir, that's obviously a clip-on."

result = re.findall(r"[a-z]+('[a-z])?[a-z]*", target_string)

print(result)

# result: ['', '', "'s", '', '', '', '']
  2. Using finditer:
import re

target_string = "please sir, that's obviously a clip-on."

result = re.finditer(r"[a-z]+('[a-z])?[a-z]*", target_string)
matched = []
    
for match_obj in result:
    matched.append(match_obj.group())

print(matched)
    
# result: ['please', 'sir', "that's", 'obviously', 'a', 'clip', 'on']

How do these two methods match patterns, and why is there a difference in the resulting output? Please explain.

I tried reading the docs but am still confused about how findall and finditer work.

Asked By: saturn366


Answers:

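Both functions find the same matches; they differ in what they return. When the pattern contains a single capturing group, findall returns only the text captured by that group, here ('[a-z]), instead of the full match. Matches in which the optional group did not take part show up as empty strings. finditer yields match objects, and match_obj.group() with no argument returns the whole match, which is why your second result contains the full words.
If you want findall to return the full match, turn the group into a non-capturing one, (?:'[a-z]):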

target_string = "please sir, that's obviously a clip-on."
result = re.findall(r"[a-z]+(?:'[a-z])?[a-z]*", target_string)
print(result)

Output:

['please', 'sir', "that's", 'obviously', 'a', 'clip', 'on']
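
To see where the difference comes from, it can help to inspect each match object from finditer directly: group(0) is the whole match, while group(1) is the capturing group that findall reports. The following is a minimal sketch using the pattern from the question; the names pattern, full_matches and group_one are only illustrative.

```python
import re

target_string = "please sir, that's obviously a clip-on."
pattern = r"[a-z]+('[a-z])?[a-z]*"

full_matches = []
group_one = []
for m in re.finditer(pattern, target_string):
    # group(0) is the entire matched text
    full_matches.append(m.group(0))
    # group(1) is the optional capturing group; it is None when the
    # group did not take part in the match
    group_one.append(m.group(1))

print(full_matches)
print(group_one)
```

The second list has a non-None entry only for "that's", which lines up with the single non-empty string in the findall result from the question (findall reports groups that did not take part as '' rather than None).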

Note that if the pattern has multiple capturing groups, findall returns a list of tuples, one tuple per match with one entry per group:

re.findall(r"([a-z]+('[a-z])?[a-z]*)", target_string)

Output:

[('please', ''), ('sir', ''), ("that's", "'s"), ('obviously', ''), ('a', ''), ('clip', ''), ('on', '')]
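
As a rough illustration of how the two functions relate, the tuple output above can be reproduced with finditer by collecting every group of each match object. This is only a sketch; via_findall and via_finditer are hypothetical names, and passing "" to groups() mimics how findall reports groups that did not take part in a match.

```python
import re

target_string = "please sir, that's obviously a clip-on."
pattern = r"([a-z]+('[a-z])?[a-z]*)"

# findall: one tuple per match, one entry per capturing group
via_findall = re.findall(pattern, target_string)

# finditer: build the same tuples from each match object;
# groups("") substitutes "" for groups that did not participate
via_finditer = [m.groups("") for m in re.finditer(pattern, target_string)]

print(via_findall == via_finditer)
```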
Answered By: mozway