Regular expression for complex numbers

Question:

I am trying to write a regular expression for complex numbers (just as an exercise to study the re module), but I can’t get it to work. I want the regex to match strings of the form '12+18j', '-14+45j', '54', '-87j' and so on. My attempt:

import re

num = r'[+-]?(?:\d*.\d+|\d+)'
complex_pattern = rf'(?:(?P<real>{num})|(?P<imag>{num}j))|(?:(?P=real)(?P=imag))'
complex_pattern = re.compile(complex_pattern)

But it doesn’t really work as I want.

m = complex_pattern.fullmatch('1+12j')
m.groupdict()

Out[166]: {'real': None, 'imag': '1+12j'}

The reason behind its structure is that I want the input string to contain either a real part, an imaginary part, or both, and I also want to be able to extract the real and imag groups from the match object. There is another approach I tried, and it seems to work except that it also matches empty strings (''):

complex_pattern = rf'(?P<real>{num})?(?P<imag>{num}j)?'
complex_pattern = re.compile(complex_pattern)

I guess I could check for the empty string with a simple if, but I’m interested in a purer way, and I’d like to know why the first implementation doesn’t work as expected.
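
A side note on the first pattern, as a minimal sketch rather than a full diagnosis: in Python's re module, (?P=name) is a backreference. It matches the same text that the named group captured earlier and does not re-apply the group's subpattern. The group names and test strings below are made up for illustration.

```python
import re

# (?P=word) must match the exact text captured by group "word".
repeat = re.compile(r'(?P<word>\d+)-(?P=word)')

print(bool(repeat.fullmatch('12-12')))   # True: same text twice
print(bool(repeat.fullmatch('12-34')))   # False: digits, but not the same text

# If the named group lives in an alternative that did not participate,
# the backreference has nothing to refer to, so that branch cannot match.
branch = re.compile(r'(?P<real>\d+)|(?P=real)j')
print(branch.fullmatch('5j'))            # None
```

So a branch such as (?:(?P=real)(?P=imag)) can only succeed if both groups have already captured something, which the preceding alternation does not allow.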

Asked By: Aisec Nory

Answers:

Does this work for what you want?

import re
words = '+122+6766j'
pattern = re.compile(r'((^[-+]?(?P<real>\d+))?[-+]?(?P<img>\d{2,}j?\w)?)')
pattern.fullmatch(words).groupdict()

Output

{'real': '122', 'img': '6766j'}
Answered By: Ade_1

I suggest using

import re
pattern = r'^(?!$)(?P<real>(?P<sign1>[+-]?)(?P<number1>\d+(?:\.\d+)?))?(?:(?P<imag>(?P<sign2>[+-]?)(?P<number2>\d+(?:\.\d+)?j)))?$'
texts = ['1+12j', '12+18j','-14+45j','54','-87j']
for text in texts:
    match = re.fullmatch(pattern, text)
    if match:
        print(text, '=>', match.groupdict())
    else:
        print(f'{text} did not match!')

Output:

1+12j => {'real': '1', 'sign1': '', 'number1': '1', 'imag': '+12j', 'sign2': '+', 'number2': '12j'}
12+18j => {'real': '12', 'sign1': '', 'number1': '12', 'imag': '+18j', 'sign2': '+', 'number2': '18j'}
-14+45j => {'real': '-14', 'sign1': '-', 'number1': '14', 'imag': '+45j', 'sign2': '+', 'number2': '45j'}
54 => {'real': '54', 'sign1': '', 'number1': '54', 'imag': None, 'sign2': None, 'number2': None}
-87j => {'real': '-8', 'sign1': '-', 'number1': '8', 'imag': '7j', 'sign2': '', 'number2': '7j'}

Details

  • ^ – start of string
  • (?!$) – no end of string should follow at this position (no empty input is allowed)
  • (?P<real>(?P<sign1>[+-]?)(?P<number1>\d+(?:\.\d+)?))? – an optional "real" group:
    • (?P<sign1>[+-]?) – an optional - or + sign captured into Group "sign1"
    • (?P<number1>\d+(?:\.\d+)?) – one or more digits followed by an optional sequence of a . and one or more digits, captured into Group "number1"
  • (?:(?P<imag>(?P<sign2>[+-]?)(?P<number2>\d+(?:\.\d+)?j)))? – an optional sequence captured into the "imag" group:
    • (?P<sign2>[+-]?) – an optional - or + sign captured into Group "sign2"
    • (?P<number2>\d+(?:\.\d+)?j) – one or more digits followed by an optional sequence of a . and one or more digits, and then a j char, captured into Group "number2"
  • $ – end of string.
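
The breakdown above can be mirrored in code with re.VERBOSE, which lets each piece carry its own comment. This is only a sketch, assuming the pattern is meant to read with \d and \. throughout. The names COMPLEX_RE and split_parts are made up for illustration.

```python
import re

# Same structure as the one-line pattern, spelled out piece by piece.
COMPLEX_RE = re.compile(r"""
    ^(?!$)                               # start of string; empty input is rejected
    (?P<real>                            # optional real part
        (?P<sign1>[+-]?)
        (?P<number1>\d+(?:\.\d+)?)
    )?
    (?:
        (?P<imag>                        # optional imaginary part
            (?P<sign2>[+-]?)
            (?P<number2>\d+(?:\.\d+)?j)
        )
    )?
    $                                    # end of string
""", re.VERBOSE)


def split_parts(text):
    """Return (real, imag) as strings, or None if text does not match."""
    m = COMPLEX_RE.fullmatch(text)
    if m is None:
        return None
    return m.group("real"), m.group("imag")


print(split_parts("12+18j"))   # ('12', '+18j')
print(split_parts("54"))       # ('54', None)
print(split_parts(""))         # None
```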
Answered By: Wiktor Stribiżew

I accepted Wiktor Stribiżew’s answer and consider it really good, but I have to add something I noticed. The last string in the texts list is not grouped correctly ('-87j' gives real: -8, imag: 7j). To address this, I propose the following change to a simplified version of his answer:

import re

num = r'[+-]?(?:\d*\.\d+|\d+)'
pattern = rf'(?!$)(?P<real>{num}(?!\d))?(?P<imag>{num}j)?'

texts = ['1+12j', '12+18j','-14+45j','54','-87j']

for text in texts:
    match = re.fullmatch(pattern, text)
    if match:
        print(f'{text:>7} => {match.groupdict()}')
    else:
        print(f'{text:>7} did not match!')

Output:

  1+12j => {'real': '1', 'imag': '+12j'}
 12+18j => {'real': '12', 'imag': '+18j'}
-14+45j => {'real': '-14', 'imag': '+45j'}
     54 => {'real': '54', 'imag': None}
   -87j => {'real': None, 'imag': '-87j'}

The important difference here is the (?!\d) added to the 'real' group of the regex, which prevents strings like '-87j' from being split into '-8' and '7j'.
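
To see the effect of the lookahead in isolation, here is a small sketch that compares the pattern with and without (?!\d). It assumes the dot in num is escaped, so it only matches a literal "."; the helper name parts is made up for illustration.

```python
import re

NUM = r'[+-]?(?:\d*\.\d+|\d+)'   # assumption: "." escaped to match a literal dot

without_guard = re.compile(rf'(?!$)(?P<real>{NUM})?(?P<imag>{NUM}j)?')
with_guard = re.compile(rf'(?!$)(?P<real>{NUM}(?!\d))?(?P<imag>{NUM}j)?')


def parts(pattern, text):
    """Return (real, imag) as strings, or None if there is no full match."""
    m = pattern.fullmatch(text)
    return None if m is None else (m.group('real'), m.group('imag'))


# Without the lookahead, backtracking lets "real" stop in the middle of "87".
print(parts(without_guard, '-87j'))   # ('-8', '7j')
# With it, "real" may not end right before another digit.
print(parts(with_guard, '-87j'))      # (None, '-87j')
```

The lookahead works because the real group can only give up digits by ending in front of another digit, and that is exactly the position (?!\d) forbids.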

Answered By: Aisec Nory

Just for completeness, I wanted to add a solution that also allows basic scientific notation and accepts either i or j. I am posting it in case other people come here, as I did, looking for a regular expression that finds complex numbers. Note the key difference: a number with no imaginary part is not returned as a match.

It deviates from the original question in that it has no named groups, but that could be changed; see the commented-out line with cx_num_groups.

This expression does not include matching groups for the real and imaginary parts because it allows numbers such as 2j.

def _complex_re_gen():
    '''
    Build the pattern in pieces (because it is complicated) and return a
    regex string that matches complex numbers.
    '''
    num = r'(?:[+-]?(?:\d*\.)?\d+)'
    num_sci = r'(?:{num}(?:e[+-]?\d+)?)'.format(num=num)
    cx_num = r'(?:{num_sci}?{num_sci}[ij])'.format(num_sci=num_sci)
    #cx_num_groups = cx_num = r'(?:(?P<real>{num_sci})?(?P<img>{num_sci}[ij])?)'.format(num_sci=num_sci)
    # Accept the number either bare or wrapped in literal parentheses.
    cx_match_wrapped = r"^(?:{cx_num}|\({cx_num}\))$".format(cx_num=cx_num)
    return cx_match_wrapped

With the following test strings, this regexp returns a match for the ones marked "match":

cmplx_tests = [
        '1 + 2j'    , #no match
        '1e5-2e-2j' , #match
        'i2 +4j'    , #no match
        '1.25'      , #no match
        '-5-3.2i'   , #match
        '64.2-3.9j' , #match
    ]
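
A quick way to check such test strings is to run them through the compiled pattern. The sketch below rebuilds the expression under the assumption that it is meant to use \d, \. and escaped literal parentheses; the names complex_re, PATTERN and results, and the two extra test strings '(3+4j)' and '2j', are added here for illustration.

```python
import re


def complex_re():
    """Rebuild the pattern piece by piece (assumed escapes: \\d, \\., \\( and \\))."""
    num = r'(?:[+-]?(?:\d*\.)?\d+)'
    num_sci = rf'(?:{num}(?:e[+-]?\d+)?)'
    cx_num = rf'(?:{num_sci}?{num_sci}[ij])'
    return rf'^(?:{cx_num}|\({cx_num}\))$'


PATTERN = re.compile(complex_re())

tests = ['1 + 2j', '1e5-2e-2j', 'i2 +4j', '1.25',
         '-5-3.2i', '64.2-3.9j', '(3+4j)', '2j']

# Map each test string to True (match) or False (no match).
results = {s: PATTERN.match(s) is not None for s in tests}

for s, ok in results.items():
    print(f'{s!r:>12} -> {"match" if ok else "no match"}')
```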

I wrote this answer in part because I wanted to solve a problem from another post: parsing complex arrays inside parameter files.

Answered By: punyidea