Looping through Python regex matches

Question:

I want to turn a string that looks like this:

ABC12DEF3G56HIJ7

into

12 * ABC
3  * DEF
56 * G
7  * HIJ

I want to construct the correct set of loops using regex matching. The crux of the issue is that the code has to be completely general because I cannot assume how long the [A-Z] fragments will be, nor how long the [0-9] fragments will be.

Asked By: da5id


Answers:

Python’s re.findall should work for you.

import re

s = "ABC12DEF3G56HIJ7"
pattern = re.compile(r'([A-Z]+)([0-9]+)')

for (letters, numbers) in re.findall(pattern, s):
    print(numbers, '*', letters)
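
The output requested in the question pads the numbers to a common width, which the loop above does not do. Here is a minimal sketch of one way to get that alignment, assuming the padding width is simply the length of the longest number found. The names pairs, width and lines are illustrative only.

```python
import re

s = "ABC12DEF3G56HIJ7"

# Each match becomes a (letters, numbers) tuple because the pattern has two groups.
pairs = re.findall(r'([A-Z]+)([0-9]+)', s)

# Length of the longest number; default=0 keeps max() from failing on no matches.
width = max((len(numbers) for _, numbers in pairs), default=0)

# Left-justify each number to that width so the '*' signs line up.
lines = [f"{numbers.ljust(width)} * {letters}" for letters, numbers in pairs]

for line in lines:
    print(line)
```

This needs every match before anything can be printed, so it suits findall better than a lazy iterator.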
Answered By: Ray Toal

It is better to use re.finditer if your dataset is large, because it reduces memory consumption: findall() returns a list of all results, while finditer() yields them one at a time.

import re

s = "ABC12DEF3G56HIJ7"
pattern = re.compile(r'([A-Z]+)([0-9]+)')

for m in re.finditer(pattern, s):
    print(m.group(2), '*', m.group(1))
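
To make the one-at-a-time behaviour concrete, here is a small sketch that wraps finditer in a generator. The helper name iter_pairs is hypothetical, and converting the digits to int is an assumption about how the counts would be used.

```python
import re

PATTERN = re.compile(r'([A-Z]+)([0-9]+)')

def iter_pairs(text):
    """Yield (count, letters) pairs lazily, one match at a time."""
    for m in PATTERN.finditer(text):
        letters, numbers = m.groups()  # both captured groups as a tuple
        yield int(numbers), letters

s = "ABC12DEF3G56HIJ7"

# Only the first match is consumed here; the rest of the string is not scanned yet.
first = next(iter_pairs(s))

# Consuming the whole generator gives every pair in order.
all_pairs = list(iter_pairs(s))
```

Each match object also exposes m.span() if the positions in the original string are needed.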
Answered By: Mithril

Yet another option could be to use re.sub() to create the desired strings from the captured groups:

import re
s = 'ABC12DEF3G56HIJ7'
for x in re.sub(r"([A-Z]+)(\d+)", r'\2 * \1,', s).rstrip(',').split(','):
    print(x)

12 * ABC
3 * DEF
56 * G
7 * HIJ
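
A possible variation on the same idea, sketched under the assumption that one newline-separated string is acceptable: let the replacement emit the line break itself, so no comma placeholder or split is needed. The variable name result is illustrative.

```python
import re

s = 'ABC12DEF3G56HIJ7'

# \2 and \1 refer to the captured digits and letters; each match ends with a newline.
result = re.sub(r"([A-Z]+)(\d+)", r"\2 * \1\n", s).rstrip("\n")

print(result)
```

As with any re.sub approach, text that does not match the pattern is passed through unchanged, so this only works cleanly when the whole string consists of letter-number pairs.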
Answered By: cottontail