Converting badly formatted number and currency input from users to a float

Question:

I need to write a Python script that converts badly formatted user input into a float.

For example
"10,123.20 Kč" to "10123.2"
"10.023,123.45 Kč" to "10023123.45"
"20 743 210.2 Kč" to "20743210.2"
or any other bad input – this is what I’ve come up with.

Kč is Czech koruna.

My thought process was to get rid of any spaces and letters, then change every comma to a dot so the number looks like "123.123.456.78", then delete all dots except the last one in the string, and finally convert it to a float so it looks like "123123456.78". But I don’t know how to do that. If you know a faster or easier way, I would like to hear it.

This is what I have and I’m lost now.

import re

my_list = ['100,30 Kč','10 000,00 Kč', '10,000.00 Kč', '10000 Kč', '32.100,30 Kč', '12.345,678.91 Kč']

for i in my_list:
    ws = i.replace("Kč", '')
    x = re.sub(',','.', ws).replace(" ","")
    print(x)
Asked By: HonzaVole


Answers:

This should do the job.

def parse_entry(entry):

    #remove currency and spaces
    entry = entry.replace("Kč", "")
    entry = entry.replace(" ", "")

    #check if a comma is used for decimals or thousands
    comma_i = entry.find(",")
    if len(entry[comma_i:]) > 3: #it's a thousands separator, it can be removed
        entry = entry.replace(",", "")
    else: #it's a decimal separator
        entry = entry.replace(",", ".") #convert it to dot

    #remove extra dots
    while entry.count(".") > 1:
        entry = entry.replace(".", "", 1) #replace once
    return round(float(entry), 1) #round to 1 decimal

my_list = ['100,30 Kč','10 000,00 Kč', '10,000.00 Kč', '10000 Kč', '32.100,30 Kč', '12.345,678.91 Kč']
parsed = list(map(parse_entry, my_list))
print(parsed) #[100.3, 10000.0, 10000.0, 10000.0, 32100.3, 12345678.9]
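
Note that rounding to one decimal drops the cents when the input has two decimal digits (12345678.91 becomes 12345678.9). If that matters, a minimal sketch that keeps every digit could return a decimal.Decimal instead. The name parse_entry_exact is made up for this example; the comma heuristic is the same one used in parse_entry above.

```python
from decimal import Decimal

def parse_entry_exact(entry):
    """Hypothetical variant of parse_entry that keeps every decimal digit."""
    # remove currency and spaces
    entry = entry.replace("Kč", "").replace(" ", "")

    # same comma heuristic as above: more than two characters after the
    # first comma means it is a thousands separator
    comma_i = entry.find(",")
    if comma_i != -1 and len(entry[comma_i:]) > 3:
        entry = entry.replace(",", "")
    else:
        entry = entry.replace(",", ".")

    # drop dots from the left until only the last one remains
    while entry.count(".") > 1:
        entry = entry.replace(".", "", 1)
    return Decimal(entry)

print(parse_entry_exact("12.345,678.91 Kč"))  # 12345678.91
```

Decimal also avoids binary floating-point rounding, which is usually what you want for money, but call float() on the result if you really need a float.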
Answered By: alec_djinn

You could find all the digit groups instead of trying to remove the non-numeric characters.

In any case you have to make some assumptions about the input. Here is the code, assuming that a final block of exactly two digits in a text with separators is the fractional part.

import re

my_list = ['100,30 Kč','10 000,00 Kč', '10,000.00 Kč', '10000 Kč', '32.100,30 Kč', '12.345,678.91 Kč']

for s in my_list:
    parts = re.findall(r'\d+', s)  # every run of consecutive digits
    if len(parts) == 1 or len(parts[-1]) != 2:
        parts.append('0')
    print(float(''.join(parts[:-1]) + '.' + parts[-1]))
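
The two-digit assumption does not cover "20 743 210.2 Kč" from the question, where the fractional part has a single digit: the trailing 2 would be glued onto the integer part. A possible relaxation, only a sketch, is to treat a final block of one or two digits as the fraction whenever there is more than one block. The function name to_float is made up for this example.

```python
import re

def to_float(text):
    """Hypothetical helper: a final block of 1-2 digits is the fraction."""
    parts = re.findall(r"\d+", text)  # every run of consecutive digits
    if not parts:
        raise ValueError(f"no digits found in {text!r}")

    if len(parts) > 1 and len(parts[-1]) <= 2:
        # short last block after a separator: treat it as the fractional part
        whole, frac = "".join(parts[:-1]), parts[-1]
    else:
        # single block, or a last block of three or more digits: integer amount
        whole, frac = "".join(parts), "0"
    return float(f"{whole}.{frac}")

print(to_float("20 743 210.2 Kč"))  # 20743210.2
```

This is still a guess about the input: a value with three decimal digits would be read as an integer amount.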
Answered By: Bob

I tried to keep your code and add just a few lines. The idea is to replace every "," with ".", store the digits after the last "." in a variable, join the remaining parts together without dots, and then append "." and the stored digits.

import re

my_list = ['100,30 Kč','10 000,00 Kč', '10,000.00 Kč', '10000 Kč', '32.100,30 Kč', '12.345,678.91 Kč']

for i in my_list:
    ws = i.replace("Kč", '')
    x = re.sub(',','.', ws).replace(" ","")
   
    if len(x.split(".")) > 1:
        end = x.split(".")[-1]
        x = "".join(x.split(".")[:-1]) + "." + end
    print(x)
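
The same "keep only the last dot" idea can also be sketched with str.rpartition, which splits on the last occurrence of the separator. The helper name keep_last_dot is made up for this example, and it returns a float instead of printing a string.

```python
def keep_last_dot(text):
    """Hypothetical helper: commas become dots, then only the last dot is kept."""
    cleaned = text.replace("Kč", "").replace(" ", "").replace(",", ".")

    # rpartition returns ('', '', cleaned) when there is no dot at all
    head, dot, tail = cleaned.rpartition(".")
    return float(head.replace(".", "") + dot + tail)

for s in ['100,30 Kč', '10000 Kč', '32.100,30 Kč', '12.345,678.91 Kč']:
    print(keep_last_dot(s))
```

Like the loop above, this assumes the last separator is always the decimal one, so "10,000 Kč" would come out as 10.0.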
Answered By: alphaBetaGamma

Whilst the other answers work for your specific scenario (e.g. you know the currency code you’re replacing), they’re not very extensible.

So here’s a more generic approach:

import re

values = [
    "100,30 Kč",
    "10 000,00 Kč",
    "10,000.00 Kč",
    "10000 Kč",
    "32.100,30 Kč",
    "12.345,678.91 Kč",  # This value is a bit odd... is it _right_?
]

for value in values:
    # Remove any character that's not a digit or a comma
    value = re.sub("[^0-9,]", "", value)

    # Replace remaining commas with periods
    value = value.replace(",", ".")

    # Convert from string to number
    value = float(value)

    print(value)

This outputs:

100.3
10000.0
10.0
10000.0
32100.3
12345.67891
Answered By: gvee

Without the aid of re you could just do this:

my_list = ['100,30 Kč','10 000,00 Kč', '10,000.00 Kč', '10000 Kč', '32.100,30 Kč', '12.345,678.91 Kč']

def fix(s):
    r = []
    for c in s:
        if c in '0123456789':
            r.append(c)
        elif c == ',':
            r.append('.')
        elif c not in '. ':
            break
    return float(''.join(r))

for n in my_list:
    print(fix(n))

Output:

100.3
10000.0
10.0
10000.0
32100.3
12345.67891
Answered By: Cobra