Split by comma unless the comma is inside brackets, while allowing characters outside the brackets within the same field

Question:

I have this Python script that uses a regular expression.
I want to split the string s on commas, while ignoring any commas that exist within the brackets.

import re

s = """aa,bb,(cc,dd),m(ee,ff)"""
splits = re.split(r'\s*(\([^)]*\)|[^,]+)', s, re.M|re.S)
print('\n'.join(splits))
Actual output:
    aa
    ,
    bb
    ,
    (cc,dd)
    ,
    m(ee
    ,
    ff)
Desired output: 
    aa
    bb
    (cc,dd)
    m(ee,ff)

So I can’t make it handle having text outside the brackets.
Was hoping someone could help me out.

Asked By: h33

||

Answers:

Consider using findall instead: repeat a group that either matches a ( followed by non-( characters and then a ), or matches a single non-, character:

import re

s = """aa,bb,m(cc,dd)"""
matches = re.findall(r'(?:\([^(]+\)|[^,])+', s, re.M|re.S)
print('\n'.join(matches))

If speed is an issue, you can make it a bit more efficient by adding ( to the other negated character set, letting it match runs of characters, and trying that alternative first:

(?:[^(,]+|\([^(]+\))+
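
As a quick check, both patterns can be run against the string from the question. This is only a minimal sketch: the names GROUP_FIRST and PLAIN_FIRST are illustrative and not part of the answer, and neither pattern is expected to cope with nested brackets.

```python
import re

s = "aa,bb,(cc,dd),m(ee,ff)"

# First pattern: try a parenthesised run, otherwise take one non-comma character.
GROUP_FIRST = re.compile(r'(?:\([^(]+\)|[^,])+')

# Second pattern: consume runs of plain characters first, then a parenthesised run.
PLAIN_FIRST = re.compile(r'(?:[^(,]+|\([^(]+\))+')

print(GROUP_FIRST.findall(s))  # ['aa', 'bb', '(cc,dd)', 'm(ee,ff)']
print(PLAIN_FIRST.findall(s))  # ['aa', 'bb', '(cc,dd)', 'm(ee,ff)']
```

Because the group is non-capturing, findall returns the whole matches rather than tuples of groups.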
Answered By: CertainPerformance

You may use this regex with a lookahead for split:

>>> import re
>>> s = """aa,bb,(cc,dd),m(ee,ff)"""
>>> print(re.split(r',(?![^()]*\))', s))
['aa', 'bb', '(cc,dd)', 'm(ee,ff)']

RegEx Details:

  • ,: Match a comma
  • (?![^()]*\)): A negative lookahead assertion that makes sure we don’t match a comma inside (...), by asserting that no ) follows after zero or more non-bracket characters.
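
The same idea can be wrapped in a small helper with a precompiled pattern. This is a sketch under the assumption that brackets are balanced and never nested; TOP_LEVEL_COMMA and split_top_level are hypothetical names chosen for illustration.

```python
import re

# A comma counts as a separator only if no ")" is reachable ahead of it
# without first crossing another "(" or ")".
TOP_LEVEL_COMMA = re.compile(r',(?![^()]*\))')


def split_top_level(text):
    """Split text on commas that sit outside a single level of (...)."""
    return TOP_LEVEL_COMMA.split(text)


print(split_top_level("aa,bb,(cc,dd),m(ee,ff)"))
# ['aa', 'bb', '(cc,dd)', 'm(ee,ff)']
```

Since the lookahead stops at the first bracket it meets, a comma that precedes an inner ( in nested input would still be treated as a separator.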
Answered By: anubhava

Try: r',([^,()]*[(][^()]*[)][^,]*)|([^,]+)'

Tested on regex101: https://regex101.com/r/pJxRwQ/1

Answered By: shikai ng

I needed to do something similar, but I also had nested brackets.
The proposed regexes do NOT handle nesting.

I couldn’t find a regex solution, but here is a Python function that achieves the same thing. Note that it tracks square brackets, so swap in ( and ) to use it on the string from the question:

def comma_split(text: str) -> list[str]:
    flag = 0
    buffer = ""
    result = []
    for char_ in text:
        if char_ == "[":
            flag += 1
        elif char_ == "]":
            flag -= 1
        elif char_ == "," and flag == 0:
            result.append(buffer)
            buffer = ""
            continue
        buffer += char_
    if buffer:
        result.append(buffer)
    return result
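
For the parentheses used in the question, the same depth-counting idea might be generalised as below. This is a rough sketch rather than a drop-in replacement: split_outside_brackets and its sep, opening and closing parameters are hypothetical names, and it only counts depth without checking that each closer matches its opener.

```python
def split_outside_brackets(text, sep=",", opening="([{", closing=")]}"):
    """Split text on sep wherever the bracket nesting depth is zero."""
    depth = 0
    parts = []
    current = []
    for ch in text:
        if ch in opening:
            depth += 1
        elif ch in closing:
            # Clamp at zero so a stray closer does not hide later separators.
            depth = max(depth - 1, 0)
        elif ch == sep and depth == 0:
            parts.append("".join(current))
            current = []
            continue
        current.append(ch)
    if current:  # like the function above, a trailing empty field is dropped
        parts.append("".join(current))
    return parts


print(split_outside_brackets("aa,bb,(cc,dd),m(ee,ff)"))
# ['aa', 'bb', '(cc,dd)', 'm(ee,ff)']
print(split_outside_brackets("a,f(b,(c,d),e),g"))
# ['a', 'f(b,(c,d),e)', 'g']
```

Collecting characters in a list and joining once avoids repeated string concatenation on long inputs.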
Answered By: basil_man