Can someone explain this regex pattern in Python? re.findall("[a-zA-Z*,*\-!*.]"

Question:

re.findall("[a-zA-Z*,*\-!*.]"

This regular expression checks for valid letters and the characters ".", ",", "-" and "!" in a word.
I understand the first part:

[a-zA-Z]

Can someone please explain this?

[*,*\-!*.]
Asked By: anonymous a

||

Answers:

You can use the online tool regex101 to understand and test your regex. For example, here is its explanation for this regex:

Match a single character present in the list below [a-zA-Z*,*\-!*.]
a-z matches a single character in the range between a (index 97) and z (index 122) (case sensitive)
A-Z matches a single character in the range between A (index 65) and Z (index 90) (case sensitive)
*,* matches a single character in the list *, (case sensitive)
* matches the character * with index 42₁₀ (2A₁₆ or 52₈) literally (case sensitive)
, matches the character , with index 44₁₀ (2C₁₆ or 54₈) literally (case sensitive)
\- matches the character - with index 45₁₀ (2D₁₆ or 55₈) literally (case sensitive)
!*. matches a single character in the list !*. (case sensitive)
! matches the character ! with index 33₁₀ (21₁₆ or 41₈) literally (case sensitive)
* matches the character * with index 42₁₀ (2A₁₆ or 52₈) literally (case sensitive)
. matches the character . with index 46₁₀ (2E₁₆ or 56₈) literally (case sensitive)
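
To check this breakdown in Python, here is a minimal sketch. The sample string is made up for illustration. The second half shows why the hyphen needs a backslash at that position: unescaped, "*-!" would be read as a range that runs backwards, and Python rejects it.

```python
import re

# Hypothetical sample text containing every kind of character the class lists,
# plus a space and "?" that are not in the class.
sample = "a-b*c,d!e.f g?h"

pattern = r"[a-zA-Z*,*\-!*.]"

# Each match is exactly one character; the space and "?" are skipped.
matches = re.findall(pattern, sample)
print(matches)

# Without the backslash, "*-!" is parsed as a range from "*" (42) to "!" (33).
# The end point is lower than the start, so compiling the pattern fails.
try:
    re.compile("[a-zA-Z*,*-!*.]")
    unescaped_ok = True
except re.error as exc:
    unescaped_ok = False
    print("unescaped hyphen fails:", exc)
```

Placing the hyphen first or last inside the brackets is another common way to make it literal without escaping.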
Answered By: Hussein Awala

That’s a badly designed regex.

[*,*\-!*.] matches one of the characters "*", ",", "-", "!" or ".".

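The * appears three times, which is unnecessary: a character class matches exactly one character, so listing the same character more than once changes nothing. Inside the brackets, * is a literal asterisk, not a quantifier.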
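
One way to convince yourself of this is to compare which single characters each version of the class accepts. This is only a sketch: the helper name accepted is made up here, and the check is limited to the characters in string.printable.

```python
import re
import string

with_repeats = re.compile(r"[a-zA-Z*,*\-!*.]")
deduplicated = re.compile(r"[a-zA-Z*,\-!.]")


def accepted(pattern):
    """Return the set of printable ASCII characters the pattern matches on its own."""
    return {ch for ch in string.printable if pattern.fullmatch(ch)}


# Both classes accept exactly the same set of characters.
same = accepted(with_repeats) == accepted(deduplicated)
print(same)  # True

# Everything accepted besides the letters, including the literal "*".
extras = sorted(accepted(with_repeats) - set(string.ascii_letters))
print(extras)  # ['!', '*', ',', '-', '.']
```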

Whoever wrote it probably intended to match any letter, ",", "-", "!" or "." zero or more times, so the correct regex would be [a-zA-Z,\-!.]*.

Example:

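import re

example = 'foo,bar,baz!'
# Raw strings keep the backslash that escapes the hyphen intact.
print(re.findall(r"[a-zA-Z*,*\-!*.]", example))
print(re.findall(r"[a-zA-Z,\-!*.]", example))  # same as before, but with only one *
print(re.findall(r"[a-zA-Z,\-!.]*", example))  # quantifier moved outside the class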

outputs:

['f', 'o', 'o', ',', 'b', 'a', 'r', ',', 'b', 'a', 'z', '!']
['f', 'o', 'o', ',', 'b', 'a', 'r', ',', 'b', 'a', 'z', '!']
['foo,bar,baz!', '']

(The empty string is due to the *, which also accepts zero characters: after consuming the whole input, findall reports one more, empty, match at the end of the string. Using + instead of * fixes this.)
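
A short sketch of the difference between the two quantifiers, reusing the same example string (the second sample string with spaces is an assumption added for illustration):

```python
import re

example = 'foo,bar,baz!'

# "*" allows a zero-length match, so an empty string shows up at the end.
star = re.findall(r"[a-zA-Z,\-!.]*", example)
print(star)  # ['foo,bar,baz!', '']

# "+" requires at least one character, so only the real match is returned.
plus = re.findall(r"[a-zA-Z,\-!.]+", example)
print(plus)  # ['foo,bar,baz!']

# With spaces in the input, "+" returns each run of allowed characters.
spaced = re.findall(r"[a-zA-Z,\-!.]+", "foo, bar - baz!")
print(spaced)  # ['foo,', 'bar', '-', 'baz!']
```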

Answered By: Ignatius Reilly