str handling with regex
Question:
Hello friends, I hope you are all well!
I need help using regex in Python.
I need to normalize some numbers so that they all follow the same pattern. Let me explain:
Numbers (example):
0206071240004013000
04073240304015000
0001304-45.2034.4.01.2326
I need the script to read the numbers and change them so that they all follow this pattern:
- 20 numeric characters (numbers with fewer digits must be padded with "0" on the left)
- 04073240304015000 = 00004073240304015000
- Insert "-" and "." like this:
- 0000407-32.4030.4.01.5000
I started writing the code as follows: first I remove the non-numeric characters, then I check whether the number has 20 digits and pad it if it does not. Now I need to insert the punctuation, but I'm having difficulty.
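import re

with open("num.txt", "r") as arquivo:
    leitura = arquivo.readlines()
dados = leitura
for num in dados:
    # Strip everything that is not a digit (this also drops the newline)
    non_numeric = re.sub("[^0-9]", "", num)
    # Left-pad with zeros to 20 characters
    characters = f'{non_numeric:0>20}'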
Answers:
Try:
import re
txt = '''
0206071240004013000
04073240304015000
0001304-45.2034.4.01.2326'''
for n in re.findall(r'[\d.-]+', txt):
    # Strip the separators, then left-pad with zeros to 20 digits
    n = '{:0>20}'.format(n.replace('.', '').replace('-', ''))
    print('{}-{}.{}.{}.{}.{}'.format(n[:7], n[7:9], n[9:13], n[13:14], n[14:16], n[16:]))
Prints:
0020607-12.4000.4.01.3000
0000407-32.4030.4.01.5000
0001304-45.2034.4.01.2326
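Since the question asks about regex specifically, the same formatting can also be sketched with capture groups instead of slicing. The helper name `format_number` and the `ValueError` on bad input are my own assumptions, not something from the question; the 7-2-4-1-2-4 grouping is read off the expected output above.

```python
import re

# Hypothetical helper: the name and the error handling are assumptions.
# One capture group per block of the target pattern (7-2-4-1-2-4 digits).
GROUPS = re.compile(r'(\d{7})(\d{2})(\d{4})(\d)(\d{2})(\d{4})')

def format_number(raw):
    digits = re.sub(r'[^0-9]', '', raw)   # drop everything that is not a digit
    if not digits or len(digits) > 20:    # nothing to format, or too long to fit
        raise ValueError(f'cannot format {raw!r}')
    digits = digits.zfill(20)             # left-pad with zeros to 20 digits
    match = GROUPS.fullmatch(digits)
    return '{}-{}.{}.{}.{}.{}'.format(*match.groups())

print(format_number('04073240304015000'))
```

Raising on input with more than 20 digits is just one option; silently truncating would hide bad data, so I would rather see the error.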
EDIT: To read the text from a file and write the result to a new file, you can do:
import re
with open('in.txt', 'r') as f_in, open('out.txt', 'w') as f_out:
    for n in re.findall(r'[\d.-]+', f_in.read()):
        # Strip the separators, then left-pad with zeros to 20 digits
        n = '{:0>20}'.format(n.replace('.', '').replace('-', ''))
        print('{}-{}.{}.{}.{}.{}'.format(n[:7], n[7:9], n[9:13], n[13:14], n[14:16], n[16:]), file=f_out)
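One caveat with `re.findall` over the whole text: it picks up any run of digits, dots and dashes, wherever it appears. If the file has one number per line, as in the question, a line-by-line variant makes it easier to skip blank lines and flag entries that do not fit. This is only a sketch; `normalize_lines` and the `sample` list are names I made up for illustration.

```python
import re

def normalize_lines(lines):
    """Yield (line_number, formatted) pairs; formatted is None if the line has too many digits."""
    for line_number, line in enumerate(lines, start=1):
        digits = re.sub(r'[^0-9]', '', line)
        if not digits:
            continue                      # blank line, or no digits at all
        if len(digits) > 20:
            yield line_number, None       # cannot be padded to 20 digits
            continue
        n = digits.zfill(20)
        yield line_number, '{}-{}.{}.{}.{}.{}'.format(
            n[:7], n[7:9], n[9:13], n[13:14], n[14:16], n[16:])

# Made-up sample input: one valid number, one blank line, one 21-digit entry
sample = ['0206071240004013000\n', '\n', '123456789012345678901\n']
for line_number, result in normalize_lines(sample):
    print(line_number, result if result is not None else 'skipped: more than 20 digits')
```

Iterating over an open file yields its lines, so with a real file you can pass the file object directly, for example `normalize_lines(f_in)`.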