Parsing data from the squished string

Question:

I need to write a regex pattern which, from the string "PriitPann39712047623+372 5688736402-12-1998Oja 18-2,Pärnumaa,Are", will return a first name, last name, id code, phone number, date of birth and address. There are no hard requirements besides these: both the first and last names always begin with a capital letter, the id code always consists of 11 digits, the phone number calling code is +372 and the phone number itself consists of 8 digits, the date of birth has the format dd-mm-yyyy, and the address has no specific pattern.

That is, taking the example above, the result should be [("Priit", "Pann", "39712047623", "+372 56887364", "02-12-1998", "Oja 18-2,Pärnumaa,Are")]. So far I have this pattern:

r"([1-9][0-9]{10})(\+\d{3}\s*\d{7,8})(\d{1,2}-\d{1,2}-\d{1,4})"

however it returns everything except first name, last name and address. For example, ^[^0-9]* returns both the first and last name, however I don’t understand how to make it return them separately. How can it be improved so that it also separately finds both the first and last name, as well as the address?

Asked By: QLimbo


Answers:

The following regex splits each of the fields into a separate group.

r"([A-Z]+[a-z]+)([A-Z]+[a-z]+)([0-9]*)(\+372 [0-9]{8})([0-9]{2}-[0-9]{2}-[0-9]{4})(.*$)"

You can get each group by calling

import re

m = re.search(regex, search_string)
# group(0) is the whole match; the captured fields are groups 1..num_fields
for i in range(1, num_fields + 1):
    group_i = m.group(i)
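
For reference, here is a minimal end-to-end sketch of the same six-group idea, run against the string from the question. The pattern below is one possible tightening and not the only valid one: it assumes names use plain ASCII letters, the id code is exactly 11 digits, and the phone number is "+372", an optional space, then exactly 8 digits. The names PATTERN, text and result are illustrative. Since re.findall returns a list of tuples when the pattern has several groups, it produces the shape asked for in the question.

```python
import re

# Assumed pattern: fixed-width digit fields keep the phone number and the
# date of birth apart even though they are not separated in the input.
PATTERN = re.compile(
    r"([A-Z][a-z]+)"        # first name: capital letter, then lowercase ASCII
    r"([A-Z][a-z]+)"        # last name: same shape
    r"(\d{11})"             # id code: exactly 11 digits
    r"(\+372 ?\d{8})"       # phone: literal +372, optional space, 8 digits
    r"(\d{2}-\d{2}-\d{4})"  # date of birth: dd-mm-yyyy
    r"(.*)"                 # address: everything that remains
)

text = "PriitPann39712047623+372 5688736402-12-1998Oja 18-2,Pärnumaa,Are"

# With multiple groups, findall returns one tuple of groups per match.
result = PATTERN.findall(text)
print(result)
```

The "+" has to be escaped as "\+" because an unescaped "+" is a quantifier. Names containing letters such as "ä" or "õ" would not match [a-z], so the name groups would need a wider character class for such input.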
Answered By: max