Error "sre_constants.error: unmatched group" when using Pattern.sub(r'\1')


I know there have been several questions on this subject already, but none help me resolve my problem.

I have to replace names in a CSV document when they follow the tags {SPEAKER} or {GROUP OF SPEAKERS}.


The erroneous part of my script is:

list_speakers = re.compile(r'^\{GROUP OF SPEAKERS\}\t(.*)|^\{SPEAKER\}\t(.*)')

usernames = set()
for f in corpus:
    with open(f, "r", encoding=encoding) as fin:
        line = fin.readline()
        while line:
            line = line.rstrip()
            if not line:
                line = fin.readline()

            if not list_speakers.match(line):
                line = fin.readline()

            names = list_speakers.sub(r'\1', line)
            names = names.split(", ")
            for name in names:
                usernames.add(name)

            line = fin.readline()


However, I receive the following error message:

File "/usr/lib/python2.7/", line 291, in filter
    return sre_parse.expand_template(template, match)
  File "/usr/lib/python2.7/", line 831, in expand_template
    raise error, "unmatched group"
sre_constants.error: unmatched group

I am using Python 2.7.

How can I fix this?

Asked By: Basile



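The issue is a known one: in Python versions before 3.5, a backreference to a group that did not participate in the match is not replaced with an empty string. Instead, re.sub raises the "unmatched group" error. In your pattern, each alternative has its own capturing group, so when the {SPEAKER} branch matches, Group 1 is unset and r'\1' fails.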
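To see why the backreference has nothing to expand, it can help to inspect the match object directly. This is a minimal sketch using the two-alternative pattern from the question; the sample line is made up for illustration.

```python
import re

# The two-alternative pattern: each alternative has its own capturing group.
pattern = re.compile(r'^\{GROUP OF SPEAKERS\}\t(.*)|^\{SPEAKER\}\t(.*)')

# Hypothetical sample line; only the second alternative can match it.
m = pattern.match('{SPEAKER}\tAlice, Bob')

group1 = m.group(1)  # Group 1 did not participate in the match, so it is None
group2 = m.group(2)  # Group 2 holds the captured names
print(group1, group2)
```

Since Group 1 is None here, a replacement template that refers to it has no text to insert, which is what older Python versions report as an unmatched group.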

You need to make sure the pattern has only one capturing group, or use a lambda expression as the replacement argument to implement custom replacement logic.
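If you would rather keep both capturing groups, the lambda approach could look roughly like the sketch below. The helper name extract_names and the sample lines are assumptions made for illustration, not part of the original script.

```python
import re

# Same two-alternative pattern as in the question.
pattern = re.compile(r'^\{GROUP OF SPEAKERS\}\t(.*)|^\{SPEAKER\}\t(.*)')

def extract_names(line):
    # Hypothetical helper: pick whichever group took part in the match and
    # fall back to an empty string, so no unmatched group is ever expanded.
    return pattern.sub(lambda m: m.group(1) or m.group(2) or '', line)

names_single = extract_names('{SPEAKER}\tAlice')
names_group = extract_names('{GROUP OF SPEAKERS}\tAlice, Bob')
print(names_single)
print(names_group)
```

Because the replacement is computed by a function rather than a template string, this should behave the same way on Python 2.7 and on later versions.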

Here, you can easily revamp the regex into a pattern with a single capturing group:

r'^\{(?:GROUP OF SPEAKERS|SPEAKER)\}\t(.*)'

  • ^ – start of string
  • \{ – a literal {
  • (?:GROUP OF SPEAKERS|SPEAKER) – a non-capturing group matching either GROUP OF SPEAKERS or SPEAKER
  • \} – a literal } (you may also write }, it does not need escaping)
  • \t – a tab char
  • (.*) – Group 1: any 0+ chars other than line break chars, as many as possible (the rest of the line).
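Put together, the single-group pattern described above can be used with r'\1' safely, because Group 1 always participates whenever the pattern matches. The following is a minimal sketch; the function name speaker_names and the sample lines are assumptions for illustration.

```python
import re

# Single capturing group: the tag alternatives live in a non-capturing group.
list_speakers = re.compile(r'^\{(?:GROUP OF SPEAKERS|SPEAKER)\}\t(.*)')

def speaker_names(line):
    """Return the names on a tagged line, or an empty list for other lines."""
    line = line.rstrip()
    if not list_speakers.match(line):
        return []
    # Group 1 is always set when the pattern matches, so r'\1' is safe here.
    return list_speakers.sub(r'\1', line).split(', ')

usernames = set()
for sample in ['{GROUP OF SPEAKERS}\tAlice, Bob\n', '{SPEAKER}\tCarol\n', 'plain text\n']:
    usernames.update(speaker_names(sample))
print(sorted(usernames))
```

Returning an empty list for non-matching lines also avoids running the substitution on lines that carry no speaker tag.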