# How to prevent floating-point errors with pandas

## Question:

I have a problem with my Python code. I’m using pandas to read a dataset and store it in a DataFrame. I’m now trying to convert ug to mg (1000 ug == 1 mg) and g to mg (1000 mg == 1 g).

First, I convert the dtype of the column to `float64`:

```
df[data_column] = df[data_column].astype("float64")
```

After that, I select all the rows whose unit is `ug` and multiply them by `0.001`, and then the rows with `g`, multiplying them by 1000:

```
df.loc[df[unit_colum] == "g", [data_column]] *= 1000
df.loc[df[unit_colum] == "ug", [data_column]] *= 0.001
```

Btw: I know that I can also divide values in pandas, but in the end this code should run in a loop where it also converts other units (like l -> ml).

My question now is:

Is there any chance that a floating-point error occurs, and what is the best way to prevent it?

I already thought about not converting the DataFrame columns to float64 and just working with the strings. But this isn’t my preferred way.

## Answers:

It is difficult to fully avoid floating point errors in general.
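As a quick illustration of why this is hard to avoid, here is a minimal sketch in plain Python. The variable name `total` is only for illustration; the point is that decimal values such as 0.1 or 0.001 have no exact binary representation, so results can be off by a tiny amount.

```python
# 0.1 and 0.2 are stored as the nearest binary doubles, not as exact decimals
total = 0.1 + 0.2

print(total == 0.3)              # False
print(abs(total - 0.3) < 1e-12)  # True: the error is tiny, but it is there
```

Comparing with a tolerance, as in the last line, is usually safer than `==` on floats.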

You have two major options to avoid/limit them:

- perform your computations in the smallest available unit (here µg) as **integers**
- round the values to the desired precision after conversion
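The integer option could look roughly like the sketch below. It assumes the raw values are still available as strings; the sample data, the column names `value` and `unit`, the `to_ug` table and the helper `to_micrograms` are all hypothetical. `Decimal` is used only to parse the text exactly before scaling to whole micrograms.

```python
from decimal import Decimal

import pandas as pd

# Hypothetical sample data: values kept as strings, as read from the file
df = pd.DataFrame({"value": ["1.5", "250", "0.02"], "unit": ["g", "ug", "mg"]})

# Exact integer factors to the smallest unit (micrograms)
to_ug = {"g": 1_000_000, "mg": 1_000, "ug": 1}


def to_micrograms(value_text, unit):
    """Convert a decimal string in the given unit to a whole number of µg."""
    scaled = Decimal(value_text) * to_ug[unit]
    # Refuse values finer than 1 µg instead of silently truncating them
    if scaled != scaled.to_integral_value():
        raise ValueError(f"{value_text} {unit} is not a whole number of ug")
    return int(scaled)


df["value_ug"] = [to_micrograms(v, u) for v, u in zip(df["value"], df["unit"])]
print(df["value_ug"].tolist())  # [1500000, 250, 20]
```

All later arithmetic on `value_ug` is then plain integer arithmetic; you only divide back to mg or g when you present the result.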

Also, a tip for your conversion: rather than using multiple lines, you can `map` the factors:

```
factors = {'ug': 0.001, 'g': 1000, 'mg': 1}
df['data_column'] *= df['unit_column'].map(factors)
```
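For reference, a runnable sketch of the mapping combined with the rounding option. The sample data, the `value_mg` column and the choice of 6 decimal places are assumptions for illustration. One thing to be aware of: units missing from `factors` map to `NaN`, so it is worth checking for them.

```python
import pandas as pd

# Hypothetical sample data; "kg" is deliberately missing from the factors
df = pd.DataFrame({
    "data_column": [1.5, 250.0, 0.02, 3.0],
    "unit_column": ["g", "ug", "mg", "kg"],
})

factors = {"ug": 0.001, "g": 1000, "mg": 1}

# Units without a factor become NaN instead of raising an error
scale = df["unit_column"].map(factors)
unknown = df.loc[scale.isna(), "unit_column"].unique()
print(list(unknown))  # ['kg']

# Convert to mg and round to the precision you actually need (6 is arbitrary)
df["value_mg"] = (df["data_column"] * scale).round(6)
print(df)
```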

Going for integers in a known unit is certainly a good option with easy to understand error bounds and good performance. It’s effectively the same as using floating point with an absolute error threshold.

You can also switch to fractions. This should be done starting with the conversion from strings since it avoids all floating point effects. In particular `Fraction("0.01") != Fraction(0.01)` but `Fraction("0.01") == Fraction("0.1") / Fraction(10)`.
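A small sketch of that difference, using only the standard library (the variable names are just for illustration):

```python
from fractions import Fraction

from_text = Fraction("0.01")   # parsed from the decimal string: exactly 1/100
from_float = Fraction(0.01)    # exact value of the nearest binary double

print(from_text)                                    # 1/100
print(from_text == from_float)                      # False
print(from_text == Fraction("0.1") / Fraction(10))  # True
```

So once a value has passed through `float`, the rounding is already baked in; `Fraction` can only stay exact if it sees the original text.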

This should work:

```
import fractions

# Assumes the column still holds the original strings (not float64)
df[data_column] = df[data_column].map(fractions.Fraction)
df.loc[df[unit_colum] == "g", [data_column]] *= fractions.Fraction(1000)
df.loc[df[unit_colum] == "ug", [data_column]] *= fractions.Fraction(1, 1000)
```
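If you want to try this end to end, here is a self-contained sketch. The sample data and the names `value`, `unit`, `value_mg` and `value_mg_float` are hypothetical. The values are kept as `Fraction` objects in an object-dtype column and only turned into floats at the very end, for display.

```python
from fractions import Fraction

import pandas as pd

# Hypothetical sample data, values still as strings
df = pd.DataFrame({"value": ["1.5", "250", "0.02"], "unit": ["g", "ug", "mg"]})

# Exact conversion factors to mg
factors = {"g": Fraction(1000), "mg": Fraction(1), "ug": Fraction(1, 1000)}

# Parse each string exactly and scale it; no float is involved here
exact = [Fraction(v) * factors[u] for v, u in zip(df["value"], df["unit"])]
df["value_mg"] = pd.Series(exact, index=df.index, dtype=object)

# Convert to float only when a float is really needed (printing, plotting)
df["value_mg_float"] = df["value_mg"].map(float)
print(df)
```

Keep in mind that an object column of `Fraction` values is slower than a numeric column, so this is a trade of speed for exactness.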