Parsing Data using Regex. Split it into columns via groups

Question:

I want to use a regex to parse my data into 3 columns.

Film data:
Marvel Comics Presents (1988) #125
Spider-Man Legends Vol. II: Todd Mcfarlane Book I (Trade Paperback)
Spider-Man Legends Vol. II: Todd Mcfarlane Book I
Spider-Man Legends Vol. II: Todd Mcfarlane Book I (1998)
Marvel Comics Presents #125

Expected output:
[image: expected output]

I can see how to group it, but can’t seem to REGEX it:
[image: intended grouping]

I built this expression: (.*)(\(\d{4}\))(.*)

I want to essentially use the ? quantifier to say the following:
(.*)(\(\d{4}\))?(.*)
sort of like saying this group may or may not be there?

Nevertheless, it’s not working.

Asked By: Benjamin Stringer

Answers:

You could use 3 capture groups, where the last 2 are optional:

^(.*?)(?:\((\d{4})\))?\s*(#\d+)?$

The pattern matches:

  • ^ Start of string
  • (.*?) Capture group 1, match any chars, as few as possible
  • (?:\((\d{4})\))? Optional non-capturing group that matches ( and ) around 4 digits, capturing the digits in group 2
  • \s* Match optional whitespace chars
  • (#\d+)? Optional group 3, match # and 1+ digits
  • $ End of string

See a regex101 demo.
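
As a rough illustration of how the pattern above splits the sample lines into three columns, here is a minimal Python sketch. The helper name split_title and the choice to strip whitespace from group 1 are assumptions made for this example and are not part of the original answer. The expected output in the question was an image, so it is also an assumption that a trailing "(Trade Paperback)" should stay in the first column.

```python
import re

# Pattern from the answer, with the backslash escapes written out.
PATTERN = re.compile(r"^(.*?)(?:\((\d{4})\))?\s*(#\d+)?$")


def split_title(line):
    """Return (title, year, issue); year and issue are None when absent.

    split_title is a hypothetical helper name used only for this sketch.
    """
    match = PATTERN.match(line)
    if match is None:
        # Only reachable for input with embedded newlines; keep the line as-is.
        return line, None, None
    title, year, issue = match.groups()
    # The lazy group 1 can keep a trailing space before "(year)", so strip it.
    return title.strip(), year, issue


rows = [
    "Marvel Comics Presents (1988) #125",
    "Spider-Man Legends Vol. II: Todd Mcfarlane Book I (Trade Paperback)",
    "Spider-Man Legends Vol. II: Todd Mcfarlane Book I",
    "Spider-Man Legends Vol. II: Todd Mcfarlane Book I (1998)",
    "Marvel Comics Presents #125",
]

for row in rows:
    print(split_title(row))
```

Because group 1 is lazy, it stops as soon as the optional year and issue groups can match through to the end of the line. The greedy (.*) in the question's attempt instead consumes the whole line, so an optional group placed after it never gets to take part in the match.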

Answered By: The fourth bird