How to convert negative strings to float numbers in pandas?

Question:

I have a series of negative strings in my dataset. I’d like to convert them into negative floats, but I get ValueError: could not convert string to float: '-'. I suspected a problem with the encoding, so I tried replacing the - character with the Unicode hyphen, but got the same error.

I’ve also tried replacing every Unicode dash variant I could find with a normal hyphen, but that didn’t work either.

I use Python 3.8.1 and pandas 1.0.2.

Are there any workarounds?

P.S. There is a similar question here, but it didn’t help.

Here is what I’ve done:
The dataset is here. It’s called ‘1240K+HO’, extension .anno.

Then:

# open file
import pandas as pd

df = pd.read_table('v42.4.1240K_HO.anno', index_col=0,
                   usecols=['Index',
                            'Instance ID',
                            'Master ID',
                            'Average of 95.4% date range in calBP (defined as 1950 CE)',
                            'Country',
                            'Lat.',
                            'Long.'],
                   na_values='..')

Then I try to convert the strings in the ‘Lat.’ column to float numbers.

# convert strings to floats
df['Lat.'] = df['Lat.'].astype(float)
Asked By: Arthurio


Answers:

The issue is that there is at least one '-' value. That’s it, just a hyphen with no figure after it.
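
One way to confirm this is to list the entries that fail numeric parsing. The following is a minimal sketch, assuming a small made-up Series in place of the real ‘Lat.’ column (the names lat, parsed and bad are hypothetical):

```python
import pandas as pd

# Hypothetical sample standing in for df['Lat.']; the values are made up.
lat = pd.Series(['-33.5', '48.2', '-', None, '12.0'])

# Try to parse every entry; anything unparseable becomes NaN.
parsed = pd.to_numeric(lat, errors='coerce')

# Keep only entries that were present in the input but failed to parse.
bad = lat[parsed.isna() & lat.notna()]
print(bad.unique())
```

On the real data, the same check on df['Lat.'] should show whether a bare '-' is the only offending value or whether other placeholders are present as well.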

You can do this:

import numpy as np

df['Lat.'] = df['Lat.'].replace('-', np.nan)

Then this will work:

df['Lat.'] = df['Lat.'].astype(float)
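
As a possible alternative, the bare hyphen could be declared as a missing-value marker when the file is read, since na_values also accepts a list. This is only a sketch: it assumes that '-' really does mean "missing" in this dataset, and it uses a tiny made-up tab-separated sample rather than the real .anno file.

```python
import io
import pandas as pd

# Hypothetical tab-separated sample; the real .anno file has many more columns.
sample = "Index\tLat.\tLong.\n1\t-33.5\t151.2\n2\t-\t..\n3\t48.2\t-2.1\n"

# Treat both '..' and a bare '-' as missing while reading.
# Only whole fields are matched, so '-33.5' is left untouched.
df = pd.read_table(io.StringIO(sample), index_col=0, na_values=['..', '-'])

print(df.dtypes)
```

With the placeholders handled at read time, the columns should already be parsed as floats, so a later astype(float) is no longer needed.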
Answered By: mechanical_meat

In case you still get an error, you can use pd.to_numeric with errors='coerce' to convert non-numeric elements to NaN. You can then replace the NaN values with 0, or whatever you wish, from there.

import pandas as pd

df['Lat.'] = pd.to_numeric(df['Lat.'], errors='coerce')
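
A small illustration of what the coercion does, using a made-up Series rather than the real column (the names lat, converted, n_coerced and filled are hypothetical):

```python
import pandas as pd

# Hypothetical sample values; '-' and 'n/a' are not valid numbers.
lat = pd.Series(['-33.5', '-', 'n/a', '48.2'])

# Unparseable entries become NaN instead of raising ValueError.
converted = pd.to_numeric(lat, errors='coerce')

# Count how many entries were coerced, so nothing is lost silently.
n_coerced = int(converted.isna().sum())

# Optionally replace NaN with a chosen value.
filled = converted.fillna(0.0)

print(converted.tolist(), n_coerced)
```

Note that 0 is itself a valid latitude, so for coordinates it may be safer to keep NaN than to fill with 0.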
Answered By: JC23