re.findall() function python

Question:

Can you please help me to understand the following line of the code:

import re
a = re.findall(r'[А-Яа-я-\s]+', string)

I am a bit confused about the pattern that has to be found in the string. As I understand it, a match should start with А and end with any character between А and я, with the parts separated by - and a space, but what does the second term, Яа, stand for?

Asked By: Alberto Alvarez


Answers:

[         ]      any of the characters in here
 А-Я             any character between А and Я, inclusive
    а-я          any character between а and я, inclusive
       -         the character - (this placement is ambiguous; a literal - should only be at the very start or end of the class)
        \s       any whitespace character
           +     at least one of the preceding class of characters

[А-Яа-я-\s]+     at least one character that is a letter between А and Я (uppercase or lowercase), a dash, or whitespace
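
To see the breakdown above in action, here is a minimal sketch. The sample text is made up for illustration (the question does not show its input), and it assumes the pattern uses \s for whitespace, as the breakdown describes.

```python
import re

# Hypothetical sample text; not taken from the question.
text = "Hello Санкт-Петербург, мир!"

# Raw string so that \s reaches the regex engine unchanged.
pattern = r'[А-Яа-я-\s]+'

# findall returns every non-overlapping run of allowed characters.
matches = re.findall(pattern, text)
print(matches)  # [' Санкт-Петербург', ' мир']
```

Note that the Latin word and the punctuation break the runs, and that each match keeps its leading space because whitespace is part of the class.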

The [] is called a "character class" in regex, and it basically says "any one of the characters inside here is valid". The + then means "at least one occurrence of the preceding character or class".
Python has a Regular Expression HOWTO that you might find useful to read through.
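
On the point about where the dash goes: a rough sketch of the same class with the - moved to the end, where it cannot be mistaken for part of a range. The sample string is hypothetical, and the check only shows that the two spellings agree on this one input.

```python
import re

# Dash sits between a range and \s, as in the question.
original = re.compile(r'[А-Яа-я-\s]+')
# Dash moved to the very end of the class, so it is clearly a literal.
rewritten = re.compile(r'[А-Яа-я\s-]+')

# Hypothetical sample: Cyrillic words joined by dashes and a space.
sample = "до-ре-ми фа"

same_result = original.findall(sample) == rewritten.findall(sample)
whole_match = rewritten.fullmatch(sample) is not None
print(same_result, whole_match)  # True True
```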

Answered By: Green Cloak Guy