How to convert this Series of strings to a Series of numbers

Question:

I have a pandas Series as follows:

series = pd.Series(['', '', '$ 200,000', '$ -320,000','$ 700,000','','$ -10,000']) 

I need to convert all string values in the series to integers or floats by removing '$ ' and the commas. I have tried the following:

for i in series:
    if '$ ' in i:
        i=i.replace('$ ',"")
series = pd.to_numeric(series, errors='coerce')

However, the replace method doesn’t change the series. I’m also not sure whether pd.to_numeric is going to help me convert the series to numbers.

Asked By: Alice_inwonderland


Answers:

replace and to_numeric

  • Remove anything that isn’t a digit, period/decimal point, or minus sign.
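  • pandas.Series.str.replace treated the pattern as a regex by default before pandas 2.0; from 2.0 on the default is regex=False, so pass regex=True explicitly.

pd.to_numeric(series.str.replace('[^0-9.-]', '', regex=True))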

0         NaN
1         NaN
2    200000.0
3   -320000.0
4    700000.0
5         NaN
6    -10000.0
dtype: float64
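
For reference, here is a self-contained sketch of the approach above, using the series from the question. The variable names cleaned and numbers are only illustrative. It also shows why the loop in the question had no effect: assigning to i inside the loop only rebinds a local name to a new string, and nothing is ever written back to the series. Working on the whole Series at once avoids that.

```python
import pandas as pd

series = pd.Series(['', '', '$ 200,000', '$ -320,000', '$ 700,000', '', '$ -10,000'])

# Strip every character that is not a digit, a period, or a minus sign.
# regex=True is passed explicitly because the default changed in pandas 2.0.
cleaned = series.str.replace(r'[^0-9.-]', '', regex=True)

# Empty strings are parsed as NaN, so the result has dtype float64.
numbers = pd.to_numeric(cleaned)
print(numbers)
```

Because str.replace returns a new Series, the result has to be assigned to a name (or passed straight to pd.to_numeric) rather than expected to change series in place.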

Equivalently, with pandas.Series.replace, which must be given regex=True to treat the pattern as a regex:

pd.to_numeric(series.replace('[^0-9.-]', '', regex=True))
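
The results above are float64 because the empty strings become NaN, which a plain integer dtype cannot hold. Since the question asks for integers or floats, one possible follow-up (an addition here, not part of the original answer) is to cast to pandas' nullable Int64 dtype, assuming every non-missing value is a whole number. The name as_int is only illustrative.

```python
import pandas as pd

series = pd.Series(['', '', '$ 200,000', '$ -320,000', '$ 700,000', '', '$ -10,000'])

# Same cleaning step as above, using Series.replace with regex=True.
numbers = pd.to_numeric(series.replace(r'[^0-9.-]', '', regex=True))

# Nullable integer dtype: NaN values become <NA> instead of forcing floats.
# This cast assumes all non-missing values are whole numbers.
as_int = numbers.astype('Int64')
print(as_int)
```
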
Answered By: piRSquared