Python's regex star quantifier not working as expected

Question

I’m trying to use regular expressions to select only groups of words within quotation marks.

Example.

Input:

this is 'a sentence' with less 'than twenty words'

Output:

['a sentence', 'than twenty words']

The regex I’m using is:

'\'[\w]+[ ]+[[\w]+[ ]+]*[\w]+\''

But it returns only ‘than twenty words’. In fact, it only matches quoted strings that contain two spaces.

Asked By: Claudia

Answer 1

Try this:

import re
re.findall(r"'(\s*\w+\s+\w[\s\w]*)'", input_string)
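
For context, here is a minimal sketch of why the original pattern seems to match only three-word strings. It assumes the pattern in the question was meant to be '[\w]+[ ]+[[\w]+[ ]+]*[\w]+' with the backslashes in place. Square brackets create character classes rather than groups, so the star ends up applying to a literal "]" instead of to a repeated "word plus spaces" unit. The variable names are only illustrative.

```python
import re
import warnings

text = "this is 'a sentence' with less 'than twenty words'"

# Assumed reconstruction of the question's pattern. Square brackets build
# character classes, so "[[\w]+[ ]+]*" is read as three pieces:
#   [[\w]+   one or more of "[" or a word character
#   [ ]+     one or more spaces
#   ]*       zero or more literal "]"
broken = r"'[\w]+[ ]+[[\w]+[ ]+]*[\w]+'"

# Same idea with a non-capturing group, so "*" repeats "word + spaces".
grouped = r"'[\w]+[ ]+(?:[\w]+[ ]+)*[\w]+'"

with warnings.catch_warnings():
    # Recent Python versions warn about "[[" as a possible nested set.
    warnings.simplefilter("ignore", FutureWarning)
    broken_matches = re.findall(broken, text)

grouped_matches = re.findall(grouped, text)

print(broken_matches)   # ["'than twenty words'"]
print(grouped_matches)  # ["'a sentence'", "'than twenty words'"]
```

Neither pattern has a capturing group, so findall returns the whole match here, quotes included.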

Answered By: Ahsanul Haque

Answer 2

import re
sentence = "this is 'a sentence' with less 'than twenty words' and a 'lonely' word"
regex = re.compile(r"(?<=')\w+(?:\s+\w+)+(?=')")
regex.findall(sentence)
# ['a sentence', 'than twenty words']

We want to match the text between the quotes without including the quotes themselves in the result, so we use a positive lookbehind assertion (?<=') before the text and a lookahead assertion (?=') after it.

Inside the quotes, we want at least one word, followed by at least one group made of whitespace and a word. We don’t want this to be a capturing group, otherwise findall would return only the group instead of the whole match, so we make it non-capturing by using (?:...).
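
To illustrate the point about findall, here is a small sketch comparing the same pattern with a capturing and a non-capturing group. The variable names are only illustrative.

```python
import re

sentence = "this is 'a sentence' with less 'than twenty words'"

# Capturing group: findall returns only the text of the group, which is
# the last repetition (leading whitespace plus the final word).
capturing = re.findall(r"(?<=')\w+(\s+\w+)+(?=')", sentence)

# Non-capturing group: findall returns the whole match.
non_capturing = re.findall(r"(?<=')\w+(?:\s+\w+)+(?=')", sentence)

print(capturing)      # [' sentence', ' words']
print(non_capturing)  # ['a sentence', 'than twenty words']
```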

Answered By: Thierry Lathuille

Answer 3

This returns the strings between quotation marks, as long as they contain only word characters and whitespace.

import re
st = "this is 'a sentence' with less 'than twenty words'"
re.findall(r"'([\w\s]+)'", st)
# ['a sentence', 'than twenty words']

Answered By: Saeed Ghareh Daghi

Answer 4

Late answer, but you can use:

import re
string = "this is 'a sentence' with less 'than twenty words'"
result = re.findall("'(.*?)'", string)
print(result)
# ['a sentence', 'than twenty words']
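
As a rough illustration of what the ? after .* does, the sketch below compares the lazy and greedy forms on a longer sample sentence (borrowed from another answer, not part of this one). Note that this approach accepts anything between two quotes, so a single quoted word is returned as well.

```python
import re

string = "this is 'a sentence' with less 'than twenty words' and a 'lonely' word"

# Lazy: stop at the first closing quote after each opening quote.
lazy = re.findall(r"'(.*?)'", string)

# Greedy: run to the last quote in the string, swallowing everything between.
greedy = re.findall(r"'(.*)'", string)

print(lazy)    # ['a sentence', 'than twenty words', 'lonely']
print(greedy)  # ["a sentence' with less 'than twenty words' and a 'lonely"]
```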

Answered By: Pedro Lobito
