Regex function to replace month names in an input string with numbers


I’m pretty new to Python and I just started learning regex. This is a weird one. I’m working on a function that takes an input string containing month names such as 'January' and converts them to numbers, i.e. '01'. I understand what I should be doing, but I messed up the entire loop and it’s not giving me the right result. Please ignore my stupidity regarding the extremely long if block.

Here’s my code:

def transform(string):
    for i in k:
        for a in j:
            if a==k:
                j=re.sub(r"January", "01", j)
                j=re.sub(r"February", "02", j)
                j=re.sub(r"March", "03", j)
                j=re.sub(r"April", "04", j)
                j=re.sub(r"May", "05", j)
                j=re.sub(r"June", "06", j)
                j=re.sub(r"July", "07", j)
                j=re.sub(r"August", "08", j)
                j=re.sub(r"September", "09", j)
                j=re.sub(r"October", "10", j)
                j=re.sub(r"November", "11", j)
                j=re.sub(r"December", "12", j)
                return (' '.join(j))
                return('This is a string without a month in it')

print( transform('I was born on June 24 and my sister was born on May 17') )
# expected output: 'I was born on 06 24 and my sister was born on 05 17'

print( transform('This is a string without a month in it') )
# expected output: 'This is a string without a month in it'

Well, let me explain what I tried to do. I tried to split the input string and look for any word that equals one of the terms in k[]. If there is one, transform the list using re.sub() and then join it back together to print it. If no value from the split is also in k[], print that there’s no month text.

Please help correct my code. I know I was extremely bad with the loops, but I’m seriously working on it. I do want to use regex substitution to solve this problem, since that’s what I was trying to learn. Please help.

Asked By: JBlack



You may use

import re
def transform(text):
    dct = {'January':'01','February':'02','March':'03','April':'04','May':'05','June':'06','July':'07','August':'08','September':'09','October':'10','November':'11','December':'12'}
    # Match any month name as a whole word and look up its number in dct
    output, n = re.subn(rf'\b(?:{"|".join(dct.keys())})\b', lambda x: dct[x.group()], text)
    if not n:
        return 'This is a string without a month in it'
    return output

print( transform('I was born on June 24 and my sister was born on May 17') )
# => 'I was born on 06 24 and my sister was born on 05 17'

print( transform('This is a string without a month in it') )
# => 'This is a string without a month in it'


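The rf'\b(?:{"|".join(dct.keys())})\b' expression builds a pattern that matches any month name as a whole word, i.e. \b(?:January|February|...)\b. Each time a match is found, the match object is passed to the lambda given to re.subn, which returns the value stored for that key in the dct dictionary. re.subn also returns the number of replacements made, which is how the function detects a string with no month in it.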
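To make the pattern construction easier to inspect, here is a minimal sketch that builds the alternation once and prints it. The names MONTHS, MONTH_RE and replace_months are illustrative and not part of the answer above; re.escape is added only as a precaution and changes nothing for plain month names.

```python
import re

# Illustrative names (not from the answer): MONTHS, MONTH_RE, replace_months
MONTHS = {'January': '01', 'February': '02', 'March': '03', 'April': '04',
          'May': '05', 'June': '06', 'July': '07', 'August': '08',
          'September': '09', 'October': '10', 'November': '11', 'December': '12'}

# Join the month names into one alternation wrapped in word boundaries
MONTH_RE = re.compile(r'\b(?:' + '|'.join(map(re.escape, MONTHS)) + r')\b')


def replace_months(text):
    """Return a (new_text, number_of_replacements) tuple, like re.subn."""
    return MONTH_RE.subn(lambda m: MONTHS[m.group()], text)


print(MONTH_RE.pattern)                      # shows the assembled pattern
print(replace_months('June 24 and May 17'))  # ('06 24 and 05 17', 2)
print(replace_months('no month here'))       # ('no month here', 0)
```

Compiling the pattern at module level avoids rebuilding the alternation on every call, which may matter if the function is applied to many strings.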

Answered By: Wiktor Stribiżew

Use regex with dictionary data to replace:

import re

text = 'I was born on June 24 and my sister was born on May 17'

def transform(string):
    dict_data = {'January': '01', 'February': '02', 'March': '03', 'April': '04', 'May': '05', 'June': '06', 'July': '07',
                 'August': '08', 'September': '09', 'October': '10', 'November': '11', 'December': '12'}
    for key, value in dict_data.items():
        string = re.sub(key, value, string)
    return string

print(transform(text))

Output:

I was born on 06 24 and my sister was born on 05 17
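
One thing to keep in mind with a plain re.sub(key, value, string) loop is that the month name is matched anywhere, including inside longer words. The sketch below is a hypothetical variant (transform_loop and its whole_words flag are made up for illustration) that shows the difference when each key is wrapped in \b word boundaries.

```python
import re

MONTHS = {'January': '01', 'February': '02', 'March': '03', 'April': '04',
          'May': '05', 'June': '06', 'July': '07', 'August': '08',
          'September': '09', 'October': '10', 'November': '11', 'December': '12'}


def transform_loop(text, whole_words=True):
    """Replace month names one key at a time, optionally as whole words only."""
    for name, number in MONTHS.items():
        # \b anchors the name so it is not replaced inside a longer word
        pattern = rf'\b{name}\b' if whole_words else name
        text = re.sub(pattern, number, text)
    return text


print(transform_loop('The Mayor arrives in May', whole_words=False))
# The 05or arrives in 05
print(transform_loop('The Mayor arrives in May'))
# The Mayor arrives in 05
```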
Answered By: Zaraki Kenpachi

#example with perl using hash

$_ = "Mar-15 " ;

%mons = ('JAN'=>"01",'FEB'=>"02",'MAR'=>"03",'APR'=>"04",'MAY'=>"05",'JUN'=>"06",'JUL'=>"07",'AUG'=>"08",'SEP'=>"09",'OCT'=>"10",'NOV'=>"11",'DEC'=>"12");

$_ = uc($_) ;

# Look up by the first three letters so that full names (e.g. MARCH) also resolve
s/(JAN(?:UARY)?|FEB(?:RUARY)?|MAR(?:CH)?|APR(?:IL)?|MAY|JUN(?:E)?|JUL(?:Y)?|AUG(?:UST)?|SEP(?:TEMBER)?|OCT(?:OBER)?|(?:NOV|DEC)(?:EMBER)?)/$mons{substr($1, 0, 3)}/egs ;

print $_ ; # 03-15
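
Since the question is about Python, here is a rough Python sketch of the same idea: accept both abbreviated and full month names and look the number up by the first three letters. It is not a line-for-line port. The names MONTH_NUMBERS, ABBREV_RE and months_to_numbers are illustrative, and unlike the Perl snippet it adds \b word boundaries and uses re.IGNORECASE instead of upper-casing the whole string.

```python
import re

# Keys are the upper-cased three-letter abbreviations, as in the Perl hash
MONTH_NUMBERS = {'JAN': '01', 'FEB': '02', 'MAR': '03', 'APR': '04',
                 'MAY': '05', 'JUN': '06', 'JUL': '07', 'AUG': '08',
                 'SEP': '09', 'OCT': '10', 'NOV': '11', 'DEC': '12'}

# Each alternative is an abbreviation with an optional full-name suffix
ABBREV_RE = re.compile(
    r'\b(JAN(?:UARY)?|FEB(?:RUARY)?|MAR(?:CH)?|APR(?:IL)?|MAY|JUNE?|JULY?|'
    r'AUG(?:UST)?|SEP(?:TEMBER)?|OCT(?:OBER)?|NOV(?:EMBER)?|DEC(?:EMBER)?)\b',
    re.IGNORECASE)


def months_to_numbers(text):
    """Replace abbreviated or full month names, in any letter case."""
    # The first three letters of the match, upper-cased, form the lookup key
    return ABBREV_RE.sub(lambda m: MONTH_NUMBERS[m.group(1)[:3].upper()], text)


print(months_to_numbers('Mar-15'))                 # 03-15
print(months_to_numbers('born in december 1999'))  # born in 12 1999
```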

Answered By: Ian Rendak