Regex exact match
Question:
I have the following sentence:
"The size of the lunch box is around 1.5l or 1500ml"
How can I change this to:
"The size of the lunch box is around 1.5 liter or 1500 milliliter"
In some cases, the value might also be present as "1.5 l or 1500 ml" with a space.
I am not able to capture the "l" or "ml" when I try to build a function, or I get an escape error.
I tried:
def stnd(text):
    text = re.sub('^l%',' liter', text)
    text = re.sub('^ml%',' milliliter', text)
    text = re.sub('^\d+\.\d+\s*l$','^\d+\.\d+\s*liter$', text)
    text = re.sub('^^\d+\.\d+\s*ml$%','^\d+\.\d+\s*milliliter$', text)
    return text
Answers:
We can handle this replacement using a dictionary of lookup values and replacements.
import re
d = {"l": "liter", "ml": "milliliter"}
inp = "The size of the lunch box is around 1.5l or 1500ml"
# capture the number, skip optional whitespace, then capture the unit
output = re.sub(r'(\d+(?:\.\d+)?)\s*(ml|l)', lambda m: m.group(1) + " " + d[m.group(2)], inp)
print(output)
# The size of the lunch box is around 1.5 liter or 1500 milliliter
The same substitution wrapped in a function:
def stnd(text):
    return re.sub(r'(\d+(?:\.\d+)?)\s*(m?l)', lambda m: m.group(1) + " " + d[m.group(2)], text)
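One thing worth checking with this approach: without a word boundary after the unit, a number followed by any word starting with "l" (for example "2 large") would also be rewritten. The following is a minimal sketch of a stricter variant, assuming that behaviour is unwanted. The names UNITS, UNIT_RE and standardize_units are illustrative and not part of the answer above.

```python
import re

# Illustrative names; only the "l" and "ml" units come from the question.
UNITS = {"l": "liter", "ml": "milliliter"}

# number with optional decimal part, optional whitespace, unit, word boundary
UNIT_RE = re.compile(r'(\d+(?:\.\d+)?)\s*(ml|l)\b')

def standardize_units(text):
    """Rewrite '1.5l' or '1.5 l' as '1.5 liter' (and likewise for ml)."""
    return UNIT_RE.sub(lambda m: m.group(1) + " " + UNITS[m.group(2)], text)

print(standardize_units("around 1.5l or 1500ml"))    # around 1.5 liter or 1500 milliliter
print(standardize_units("around 1.5 l or 1500 ml"))  # around 1.5 liter or 1500 milliliter
print(standardize_units("2 large boxes"))            # 2 large boxes (left untouched)
```

Because the \s* sits between the number and the unit, both the spaced and the unspaced forms from the question are normalised to a single space.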
You could use a dict to list all the units as the keys, and use a pattern to find a digit followed by either "ml" or "l", which you could then use as the key for the dict to get the value.
(?<=\d)m?l\b
The pattern matches:
(?<=\d)
Positive lookbehind, assert a digit directly to the left
m?l\b
Match an optional m, followed by l and a word boundary
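To see what this pattern actually selects, here is a small sketch, assuming the pattern is written with its backslashes as (?<=\d)m?l\b. It lists the matched substrings for the unspaced sentence and for the spaced form mentioned in the question.

```python
import re

pattern = re.compile(r"(?<=\d)m?l\b")

s = "The size of the lunch box is around 1.5l or 1500ml"

# The lookbehind is zero-width, so only the unit itself is part of each match.
print(pattern.findall(s))                   # ['l', 'ml']

# With a space before the unit, the character to the left is not a digit,
# so the lookbehind fails and nothing matches.
print(pattern.findall("1.5 l or 1500 ml"))  # []
```

The "l" in "lunch" is not matched because it is preceded by a space rather than a digit. For the same reason, the spaced form "1.5 l" is left alone by this pattern.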
Example
import re
s = "The size of the lunch box is around 1.5l or 1500ml"
pattern = r"(?<=\d)m?l\b"
dct = {
    "ml": "milliliter",
    "l": "liter"
}
# fall back to the matched text (a string, not the match object) if the unit is unknown
result = re.sub(pattern, lambda x: " " + dct[x.group()] if x.group() in dct else x.group(), s)
print(result)
Output
The size of the lunch box is around 1.5 liter or 1500 milliliter
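The question also mentions values written with a space, such as "1.5 l or 1500 ml". A lookbehind for a digit immediately before the unit does not cover that case. The following is a possible sketch, assuming any whitespace between number and unit should collapse to a single space. The function name expand_units is illustrative.

```python
import re

dct = {"ml": "milliliter", "l": "liter"}

# \s* consumes any whitespace between the digit and the unit so that it can
# be replaced by exactly one space; the unit itself is captured in group 1.
pattern = re.compile(r"(?<=\d)\s*(m?l)\b")

def expand_units(s):
    # m?l can only match "l" or "ml", so the dict lookup always succeeds.
    return pattern.sub(lambda m: " " + dct[m.group(1)], s)

print(expand_units("1.5l or 1500ml"))     # 1.5 liter or 1500 milliliter
print(expand_units("1.5 l or 1500  ml"))  # 1.5 liter or 1500 milliliter
```

The lookbehind stays fixed-width, which Python's re module requires, because the variable-length whitespace is matched outside of it.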