Python RegEx for exact matches of brackets

Question:

I am trying to parse a string which is of the following format:

text = "some random string <inAngle> <anotherInAngle> [-option text] [-anotherOption <text>] [-option (Y|N)]"

I want to split the string into three parts:

  1. Just the "some random string"
  2. Everything that is ONLY in angle brackets, i.e. inAngle and anotherInAngle above.
  3. Everything that is in square brackets.

If I use the RegEx

re.findall(r'\[(.+?)\]', text)

It gives everything I need within square brackets. If I use the same RegEx with angle brackets however,

re.findall(r'<(.+?)>', text)

It also returns text in angle brackets that sits inside square brackets, for example the "text" in angle brackets inside the [-anotherOption] group above. I do not want that. The RegEx for angle brackets should return only "inAngle" and "anotherInAngle".
What would be the RegEx for it?

Also, how do I get only the first part, i.e. "some random string"? This part can contain two or three words.

Asked By: user775093

Answers:

You can simply disregard everything between square brackets before searching for things in angle brackets:

interm = re.sub(r'\[(.*?)\]', '', text)
re.findall(r'<(.+?)>', interm)

outputs

['inAngle', 'anotherInAngle']

Then, to match the first part, match everything up to the first [ or <. Granted, this won't work if the first part is allowed to contain either of these symbols on its own, unclosed:

re.findall(r'([^<\[]+)', text)[0]

outputs

some random string 
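
Putting the steps of this answer together, a minimal sketch might look like the following. The sample string is an assumption (it places <text> inside one of the square-bracket groups), and the variable names angle, square and prefix are chosen for illustration only.

```python
import re

# Assumed sample input: one square-bracket group contains <text>.
text = ("some random string <inAngle> <anotherInAngle> "
        "[-option text] [-anotherOption <text>] [-option (Y|N)]")

# Drop every [...] group first so angle brackets nested in them cannot match.
interm = re.sub(r'\[.*?\]', '', text)
angle = re.findall(r'<(.+?)>', interm)

# Square-bracket contents are taken from the original, unmodified string.
square = re.findall(r'\[(.+?)\]', text)

# The leading text is everything before the first '<' or '['.
prefix = re.match(r'[^<\[]*', text).group().strip()

print(angle)   # ['inAngle', 'anotherInAngle']
print(square)  # ['-option text', '-anotherOption <text>', '-option (Y|N)']
print(prefix)  # some random string
```

The strip() call removes the trailing space seen in the output above.
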
Answered By: David Zorychta

See whether this regex captures what you need:

\s*([^><\[\]]+\b)|\[([^\]]*)\]|<([^>]*)>
  • \s* preceded by optional whitespace
  • ([^><\[\]]+\b) Group 1: any run of non-bracket characters up to a word boundary \b (remove the \b if undesired)
  • |\[([^\]]*)\] or Group 2: what’s inside square brackets
  • |<([^>]*)> or Group 3: what’s inside angle brackets

See demo at regex101 (use “code generator” if needed)
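
As a rough sketch of how the three groups could be sorted in Python, one option is re.finditer plus a check of which group matched. The sample string and the list names plain, square and angle are assumptions for illustration.

```python
import re

# Assumed sample input: one square-bracket group contains <text>.
text = ("some random string <inAngle> <anotherInAngle> "
        "[-option text] [-anotherOption <text>] [-option (Y|N)]")

pattern = re.compile(r'\s*([^><\[\]]+\b)|\[([^\]]*)\]|<([^>]*)>')

plain, square, angle = [], [], []
for m in pattern.finditer(text):
    # Exactly one of the three groups takes part in each match.
    if m.group(1) is not None:
        plain.append(m.group(1))
    elif m.group(2) is not None:
        square.append(m.group(2))
    else:
        angle.append(m.group(3))

print(plain)   # ['some random string']
print(square)  # ['-option text', '-anotherOption <text>', '-option (Y|N)']
print(angle)   # ['inAngle', 'anotherInAngle']
```

Because the square-bracket alternative consumes the whole [...] group, the <text> inside it is never offered to the angle-bracket alternative.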

Answered By: bobble bubble

<(.+?)>(?![^\[]*\])|\[(.+?)\]|((?!\s+)[^\[\]<>]+)

You can simply use this with re.findall. See demo.

https://regex101.com/r/hE4jH0/10
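
With three capture groups, re.findall returns one 3-tuple per match, with empty strings for the groups that did not take part. A small sketch of splitting those tuples apart, assuming the sample string below (the list names are illustrative):

```python
import re

# Assumed sample input: one square-bracket group contains <text>.
text = ("some random string <inAngle> <anotherInAngle> "
        "[-option text] [-anotherOption <text>] [-option (Y|N)]")

pattern = r'<(.+?)>(?![^\[]*\])|\[(.+?)\]|((?!\s+)[^\[\]<>]+)'

# Each tuple is (angle, square, plain); unused groups are ''.
matches = re.findall(pattern, text)

angle = [a for a, s, p in matches if a]
square = [s for a, s, p in matches if s]
plain = [p.strip() for a, s, p in matches if p.strip()]

print(angle)   # ['inAngle', 'anotherInAngle']
print(square)  # ['-option text', '-anotherOption <text>', '-option (Y|N)']
print(plain)   # ['some random string']
```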

Answered By: vks