ValueError when trying to convert a negative number from a CSV file to an integer

Question:

Hello, I have the CSV file below, imported with pandas using data = pd.read_csv("1.csv"):

x1,x2,xb,y
−2,1,1,1

I need to convert the negative number (-2) to an integer with int(), but I get a ValueError:

print(data.iloc[1-1]['x1']) 
> -2 # str

print(int(data.iloc[1-1]['x1']))
> ValueError: invalid literal for int() with base 10: '−2'

I don’t get the error when I convert a positive number:

print(data.iloc[1-1]['x2'])
> 1 # str
print(int(data.iloc[1-1]['x2']))
> 1 # int
Asked By: gelerum

Answers:

The "−" in "−2" is not the ASCII minus sign that int() expects; it looks like one but is a different character.

Your print would work like this:

print(int(data.iloc[1-1]['x1'].replace("−", "-")))

And if you don’t want to replace the problematic minus signs one by one, you can apply the replacement to the whole column:

data['x1'] = data['x1'].str.replace("−", "-").astype("int")
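
For reference, here is a minimal self-contained sketch of the whole-column fix. It assumes the file content from the question, supplied inline through io.StringIO as a stand-in for "1.csv"; the name csv_text is made up for this illustration. regex=False is passed so the replacement is treated as a literal string.

```python
import io

import pandas as pd

# Inline stand-in for "1.csv" (assumption); \u2212 is the Unicode MINUS SIGN
# that appears in the question's data.
csv_text = "x1,x2,xb,y\n\u22122,1,1,1\n"
data = pd.read_csv(io.StringIO(csv_text))

# x1 is read as text because "\u22122" is not a number the parser recognizes.
# Replace the Unicode minus with the ASCII hyphen-minus, then convert to int.
data["x1"] = data["x1"].str.replace("\u2212", "-", regex=False).astype(int)

print(data.iloc[0]["x1"])
```

After this, the x1 column holds integers, so no per-cell int() call is needed.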
Answered By: Csaba

The problem is that many Unicode characters look like a minus sign…

The character that you are showing in your question is U+2212 MINUS SIGN. The character that is used for negative numbers is the ASCII U+002D HYPHEN-MINUS. While they print the same, they are different characters. You will have to clean up your data file…
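
A quick way to check which character a string actually contains is the standard unicodedata module. The sketch below uses only the standard library; to_int is a hypothetical helper name, and it handles only U+2212, not every minus look-alike.

```python
import unicodedata


def to_int(text: str) -> int:
    """Convert text to int, treating U+2212 MINUS SIGN as an ASCII minus."""
    return int(text.replace("\u2212", "-"))


# Show the code point and official Unicode name of each character.
for ch in ("\u2212", "-"):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
# U+2212 MINUS SIGN
# U+002D HYPHEN-MINUS

print(to_int("\u22122"))  # -2
```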

Answered By: Serge Ballesta