Convert decimal mark when reading numbers as input

Question

I have a CSV file with data readings that I want to read into Python. I get lists that contain strings like "2,5". Calling float("2,5") does not work, because the string uses a comma as the decimal mark.

How do I read this into Python as 2.5?

Asked By: Till B

Answer 1

float("2,5".replace(',', '.')) will do in most cases.

If value is a large number and . has been used as the thousands separator, you can:

Replace all commas with points: value.replace(",", ".")

Remove all but the last point: value.replace(".", "", value.count(".") -1)
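
The two steps above could be combined into one small helper. This is only a sketch: the name parse_decimal_comma is made up for illustration, and it assumes the input uses "," as the decimal mark and "." only for thousands.

```python
def parse_decimal_comma(value):
    """Convert a string such as '1.234,5' to a float (hypothetical helper)."""
    # Step 1: turn every comma into a point, so '1.234,5' becomes '1.234.5'.
    value = value.replace(",", ".")
    # Step 2: drop every point except the last one, leaving '1234.5'.
    value = value.replace(".", "", value.count(".") - 1)
    return float(value)


print(parse_decimal_comma("2,5"))           # 2.5
print(parse_decimal_comma("1.234.567,89"))  # 1234567.89
```

Note that a string like "1.234" with no decimal part is read as 1.234 by this approach, since the only point is kept as the decimal mark.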

Answered By: eumiro

Answer 2

Try replacing all the decimal commas with decimal dots:

floatAsStr = "2,5"
floatAsStr = floatAsStr.replace(",", ".")
myFloat = float(floatAsStr)

The replace method, of course, works on any substring, as Python does not differentiate between char and string.
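
Since the question is about a CSV file, here is a rough sketch of applying the same replacement to every field while reading. The sample data and the ";" delimiter are assumptions for illustration (files with decimal commas often use semicolons between fields); with a real file you would pass the opened file object instead of the StringIO.

```python
import csv
import io

# Hypothetical sample data: semicolon-delimited fields with decimal commas.
sample = io.StringIO("2,5;3,75\n10,0;0,125\n")

rows = []
for record in csv.reader(sample, delimiter=";"):
    # Replace the decimal comma in each field before calling float().
    rows.append([float(field.replace(",", ".")) for field in record])

print(rows)  # [[2.5, 3.75], [10.0, 0.125]]
```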

Answered By: penelope

Answer 3

Using a regex is more reliable, because it only replaces commas that sit between two digits:

import re

decmark_reg = re.compile(r'(?<=\d),(?=\d)')

ss = 'abc , 2,5 def ,5,88 or (2,5, 8,12, 8945,3 )'

print(ss)
print(decmark_reg.sub('.', ss))

Result:

abc , 2,5 def ,5,88 or (2,5, 8,12, 8945,3 )
abc , 2.5 def ,5.88 or (2.5, 8.12, 8945.3 )

If you want to handle more complex cases (numbers with no digit before the decimal mark, for example), the regex I wrote to detect all kinds of numbers in the following thread may be of interest:

stackoverflow.com/questions/5917082/regular-expression-to-match-numbers-with-or-without-commas-and-decimals-in-text/5929469

Answered By: eyquem

Answer 4

You may do it the locale-aware way:

import locale

# Set to the user's preferred locale:
locale.setlocale(locale.LC_ALL, '')
# Or a specific locale:
locale.setlocale(locale.LC_NUMERIC, "en_DK.UTF-8")

print(locale.atof("3,14"))

Read the "Background, details, hints, tips and caveats" section of the locale module documentation before using this method.

Answered By: Lauritz V. Thaulow

Answer 5

Pandas supports this out of the box:

import pandas as pd

df = pd.read_csv(r'data.csv', decimal=',')

See http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html

Answered By: maggie

Answer 6

First, you must know which locale was used to produce the number. If you do not, random problems will surely occur.

import locale

loc = locale.getlocale()  # get and save current locale
# use locale that provided the number;
# example if German locale was used:
locale.setlocale(locale.LC_ALL, 'de_DE')
pythonnumber = locale.atof(value)
locale.setlocale(locale.LC_ALL, loc)  # restore saved locale
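
One possible way to make sure the old locale is restored even if the conversion raises is to wrap the save/restore in a context manager. This is a sketch, not part of the locale module: numeric_locale and parse_in_locale are hypothetical names, and it only touches LC_NUMERIC rather than LC_ALL.

```python
import locale
from contextlib import contextmanager


@contextmanager
def numeric_locale(name):
    """Temporarily switch LC_NUMERIC to `name`, then restore the old setting."""
    # Calling setlocale() without a second argument only queries the setting.
    saved = locale.setlocale(locale.LC_NUMERIC)
    # Raises locale.Error if the named locale is not installed.
    locale.setlocale(locale.LC_NUMERIC, name)
    try:
        yield
    finally:
        # Runs even if the body raises, so the old setting always comes back.
        locale.setlocale(locale.LC_NUMERIC, saved)


def parse_in_locale(text, name):
    """Parse `text` as a number using the conventions of locale `name`."""
    with numeric_locale(name):
        return locale.atof(text)
```

For example, parse_in_locale("3,14", "de_DE.UTF-8") should give 3.14 on a system where that locale is installed; the exact locale name varies between platforms. Keep in mind that setlocale changes the setting for the whole process, so this is not safe to call from several threads at once.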

Answered By: ilias iliadis

Answer 7

If dots are used as thousands separators, you could swap commas and dots by using a third symbol as a temporary placeholder, like so:

value.replace('.', '#').replace(',', '.').replace('#', ',')

But since you want to convert from string to float, you could just remove any dots and then replace any commas with dots:

float(value.replace('.', '').replace(',', '.'))

IMO this is the most readable solution
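
Wrapped in a function, that one-liner might look like the sketch below. The name to_float is only for illustration, and it assumes every dot in the input is a thousands separator.

```python
def to_float(value):
    """Parse a number written with '.' for thousands and ',' as decimal mark."""
    # Remove the thousands separators, then turn the decimal comma into a dot.
    return float(value.replace(".", "").replace(",", "."))


print(to_float("2,5"))           # 2.5
print(to_float("1.234.567,89"))  # 1234567.89
print(to_float("1.234"))         # 1234.0
```

Be careful with input that already uses a dot as the decimal mark: "2.5" would be read as 25.0 here.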

Answered By: teebagz
