String handling with regex

Question:

Hello friends, I hope you are all well!

I need help using regex in Python.

I need to validate some numbers so that they all follow the same pattern. Let me explain:

Numbers (example):

0206071240004013000
04073240304015000
0001304-45.2034.4.01.2326

I need the script to read the numbers and change them so that they all have the following pattern:

  1. 20 numeric characters (numbers with fewer digits must be padded with "0" on the left)
  • 04073240304015000 = 00004073240304015000
  2. Insert "-" and "." as follows:
  • 0000407-32.4030.4.01.5000

I was writing the code as follows: first I remove the non-numeric characters, then I check whether the result has 20 numeric characters and, if it does not, I pad it. Now I need to insert the punctuation, but I'm having difficulties.

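import re

with open("num.txt", "r") as arquivo:
    leitura = arquivo.readlines()

dados = leitura
for num in dados:
    # remove every non-digit character (including the trailing newline)
    non_numeric = re.sub("[^0-9]", "", num)
    # left-pad with zeros to 20 characters
    characters = f'{non_numeric:0>20}'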
Asked By: Gabriel Passos

Answers:

Try:

import re

txt = '''
0206071240004013000
04073240304015000
0001304-45.2034.4.01.2326'''

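# match each run of digits, dots and hyphens
for n in re.findall(r'[\d.-]+', txt):
    # strip the punctuation, then left-pad with zeros to 20 digits
    n = '{:0>20}'.format(n.replace('.', '').replace('-', ''))
    # slice into groups of 7-2-4-1-2-4 digits and join them
    print('{}-{}.{}.{}.{}.{}'.format(n[:7], n[7:9], n[9:13], n[13:14], n[14:16], n[16:]))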

Prints:

0020607-12.4000.4.01.3000
0000407-32.4030.4.01.5000
0001304-45.2034.4.01.2326
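
Since the question asks about regex specifically, the punctuation can also be inserted with a single re.sub call that uses capture groups instead of slicing. This is only a minimal sketch, assuming the same 7-2-4-1-2-4 grouping as in the expected output; format_number is a hypothetical helper name, not something from the question:

```python
import re

# Six capture groups of 7, 2, 4, 1, 2 and 4 digits (20 digits in total).
GROUPS = re.compile(r'(\d{7})(\d{2})(\d{4})(\d)(\d{2})(\d{4})')

def format_number(raw):
    """Hypothetical helper: normalize one number to the punctuated layout."""
    digits = re.sub(r'[^0-9]', '', raw)   # keep only the digits
    digits = digits.zfill(20)             # left-pad with zeros to 20 digits
    # Rebuild the string from the groups; input longer than 20 digits
    # is not handled here and would keep its extra digits at the end.
    return GROUPS.sub(r'\1-\2.\3.\4.\5.\6', digits, count=1)

print(format_number('04073240304015000'))  # 0000407-32.4030.4.01.5000
```

Slicing and re.sub give the same result here; the regex form mainly keeps the group widths in one place.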

EDIT: To read the text from a file and write the result to a new file, you can do:

import re

with open('in.txt', 'r') as f_in, open('out.txt', 'w') as f_out:
    for n in re.findall(r'[\d.-]+', f_in.read()):
        n = '{:0>20}'.format(n.replace('.', '').replace('-', ''))
        print('{}-{}.{}.{}.{}.{}'.format(n[:7], n[7:9], n[9:13], n[13:14], n[14:16], n[16:]), file=f_out)
Answered By: Andrej Kesely
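
Because the stated goal is to validate the numbers, it may also help to check a result against the target layout and to catch inputs that cannot be padded into it (no digits at all, or more than 20 digits, a case the question does not cover). A rough sketch under those assumptions; is_formatted and has_valid_length are hypothetical names:

```python
import re

# Target layout taken from the question: NNNNNNN-NN.NNNN.N.NN.NNNN
LAYOUT = re.compile(r'[0-9]{7}-[0-9]{2}\.[0-9]{4}\.[0-9]\.[0-9]{2}\.[0-9]{4}')

def is_formatted(text):
    """Return True if text is exactly in the punctuated 20-digit layout."""
    return LAYOUT.fullmatch(text) is not None

def has_valid_length(raw):
    """Return True if raw holds between 1 and 20 digits (assumed limit)."""
    digits = re.sub(r'[^0-9]', '', raw)
    return 1 <= len(digits) <= 20

print(is_formatted('0000407-32.4030.4.01.5000'))  # True
print(is_formatted('04073240304015000'))          # False
print(has_valid_length('04073240304015000'))      # True
```

Running has_valid_length before padding avoids silently producing a malformed number from an over-long input.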