Numbers in three different formats

Question:

I am working with a dataset; below you can see a small example.

import pandas as pd
import numpy as np

data = {'id': ['9.', '09', 9]}
df = pd.DataFrame(data, columns=['id'])

df['id'] = df['id'].replace(".","")


df

In this dataset, the same number is written in three ways, e.g. '9.', '09', and 9.
I want all of these written the same way, e.g. 9, without the trailing . or the leading 0.

I tried this line of code, but it does not change anything:

df['id'] = df['id'].replace(".","")

Can anybody help me solve this problem?

Asked By: silent_hunter

||

Answers:

Try this:

pd.to_numeric(df['id'])
0    9.0
1    9.0
2    9.0
Name: id, dtype: float64

To convert to int:

pd.to_numeric(df['id']).astype(int)
0    9
1    9
2    9
Name: id, dtype: int64
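
For context, here is a minimal sketch of why the `replace` call from the question appears to do nothing, next to the `pd.to_numeric` route. It assumes the same three-value column as the question; the names `unchanged`, `as_float`, and `as_int` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"id": ["9.", "09", 9]})

# Series.replace compares whole cell values, so "." only matches a cell
# that is exactly "."; there is none here and the column comes back as-is.
unchanged = df["id"].replace(".", "")

# pd.to_numeric parses each value as a number instead of editing the text.
as_float = pd.to_numeric(df["id"])
as_int = as_float.astype(int)

print(unchanged.tolist())  # ['9.', '09', 9]
print(as_int.tolist())     # [9, 9, 9]
```

Substring replacement lives on the `.str` accessor (`Series.str.replace`), but in a mixed column like this one the integer `9` would first have to be converted to a string.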
Answered By: ACA

You could turn them all into doubles and then into integers:

df['id'].astype('double').astype(int)

But this can be a problem for large numbers that exceed the 53-bit significand of a double, because integers above 2**53 may not survive the round trip exactly. If the errant period is always at the end of the string, you could do

df['id'].astype(str).str.strip('.').astype(int)

or even use a regex that strips the dot plus anything else to its right.
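
A rough sketch of both points, assuming ids large enough to exceed 2**53. The sample value and the names `big`, `ids`, `via_float`, and `via_text` are made up for illustration, and the pattern `\..*` is just one possible regex for "the dot and everything after it".

```python
import pandas as pd

big = 2**53 + 1  # 9007199254740993, not exactly representable as a double
ids = pd.Series([f"{big}.", f"0{big}", big], dtype=object)

# Route 1: through float64. The last digit is lost to rounding.
via_float = ids.astype("float64").astype("int64")

# Route 2: clean the text, then parse as an integer.
# The regex removes the dot and anything to its right; the integer
# parse itself ignores the leading zero.
cleaned = ids.astype(str).str.replace(r"\..*", "", regex=True)
via_text = cleaned.astype("int64")

print(via_float.tolist())  # [9007199254740992, 9007199254740992, 9007199254740992]
print(via_text.tolist())   # [9007199254740993, 9007199254740993, 9007199254740993]
```

The text route never passes through a float, so it keeps every digit as long as the value fits in int64.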

Answered By: tdelaney