Suppressing scientific notation in pandas?
Question:
I have a DataFrame in pandas where some of the numbers are expressed in scientific notation (or exponent notation) like this:
id value
id 1.00 -4.22e-01
value -0.42 1.00e+00
percent -0.72 1.00e-01
played 0.03 -4.35e-02
money -0.22 3.37e-01
other NaN NaN
sy -0.03 2.19e-04
sz -0.33 3.83e-01
And the scientific notation makes what should be an easy comparison needlessly difficult. I assume it’s the 21900 value that’s screwing it up for the others. I mean 1.0 is encoded. ONE!
This doesn’t work:
np.set_printoptions(suppress=True)
And pandas.set_printoptions doesn’t implement suppress either, and I’ve looked all through pd.describe_option() in despair, and pd.core.format.set_eng_float_format() only seems to turn it on for all the other float values, with no ability to turn it off.
Answers:
Your data is probably object dtype. This is a direct copy/paste of your data. read_csv interprets it as the correct dtype. You should normally only have object dtype on string-like fields.
In [5]: df = read_csv(StringIO(data), sep=r'\s+')
In [6]: df
Out[6]:
id value
id 1.00 -0.422000
value -0.42 1.000000
percent -0.72 0.100000
played 0.03 -0.043500
money -0.22 0.337000
other NaN NaN
sy -0.03 0.000219
sz -0.33 0.383000
Check whether your dtypes are object:
In [7]: df.dtypes
Out[7]:
id float64
value float64
dtype: object
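For anyone who wants to reproduce the check above outside an IPython session, here is a minimal self-contained sketch. The `data` string is simply the table from the question pasted in; the variable names are my own choice.

```python
from io import StringIO

import pandas as pd

# Table copied from the question. The header has two fields and each row has
# three, so read_csv uses the first column as the row index.
data = """id value
id 1.00 -4.22e-01
value -0.42 1.00e+00
percent -0.72 1.00e-01
played 0.03 -4.35e-02
money -0.22 3.37e-01
other NaN NaN
sy -0.03 2.19e-04
sz -0.33 3.83e-01
"""

# A raw string keeps the backslash in the whitespace regex intact.
df = pd.read_csv(StringIO(data), sep=r"\s+")

# Both columns should come back as float64, not object.
print(df.dtypes)
print(df)
```

If both columns report float64 here but object in your own frame, the problem is in how your data was loaded, not in the display settings.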
This converts this frame to object dtype (notice the printing is funny now):
In [8]: df.astype(object)
Out[8]:
id value
id 1 -0.422
value -0.42 1
percent -0.72 0.1
played 0.03 -0.0435
money -0.22 0.337
other NaN NaN
sy -0.03 0.000219
sz -0.33 0.383
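This is how to convert it back (astype(float) also works here). Note that convert_objects() was deprecated and later removed from pandas, so on current versions use astype(float) or infer_objects() instead.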
In [9]: df.astype(object).convert_objects()
Out[9]:
id value
id 1.00 -0.422000
value -0.42 1.000000
percent -0.72 0.100000
played 0.03 -0.043500
money -0.22 0.337000
other NaN NaN
sy -0.03 0.000219
sz -0.33 0.383000
This is what an object dtype frame would look like:
In [10]: df.astype(object).dtypes
Out[10]:
id object
value object
dtype: object
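Since convert_objects() is not available in recent pandas, a rough sketch of the same round trip with calls that still exist might look like this. The frame `raw` is a made-up example of numbers that arrived as strings; it is not the asker's actual data.

```python
import pandas as pd

# Hypothetical frame whose numbers were read in as strings, forcing object dtype.
raw = pd.DataFrame(
    {
        "id": ["1.00", "-0.42", "NaN"],
        "value": ["-4.22e-01", "1.00e+00", "NaN"],
    },
    dtype=object,
)
print(raw.dtypes)  # both columns are object

# Option 1: cast the whole frame; this raises if a cell cannot be parsed.
as_float = raw.astype(float)

# Option 2: convert column by column and turn unparseable cells into NaN.
coerced = raw.apply(pd.to_numeric, errors="coerce")

print(as_float.dtypes)
print(coerced.dtypes)
```

The second form is the more forgiving one when a column contains stray text, at the cost of silently producing NaN for those cells.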
quick temporary: df.round(4)
global: pd.options.display.float_format = '{:20,.2f}'.format
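A middle ground between the two lines above is pd.option_context, which applies a display format only inside a with-block. This is just a sketch; the sample frame and the '{:,.4f}' format are arbitrary choices of mine.

```python
import pandas as pd

# Small made-up frame mixing tiny and large values.
df = pd.DataFrame({"value": [-4.22e-01, 2.19e-04, 21900.0]})

# Temporary: the format is active only inside the with-block.
with pd.option_context("display.float_format", "{:,.4f}".format):
    inside = df.to_string()
print(inside)

# Global: stays in effect for every frame until it is reset.
pd.set_option("display.float_format", "{:,.4f}".format)
print(df)

# Restore the pandas default so later output is not affected.
pd.reset_option("display.float_format")
```

Neither variant touches the stored values, whereas df.round(4) returns a new frame with rounded data.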
If you would like to use the values as formatted strings in a list, say as part of a CSV file written with csv.writer, the numbers can be formatted before creating the list:
df['label'].apply(lambda x: '%.17f' % x).values.tolist()
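To show how the formatted list could end up in a CSV file, here is a small sketch. The column name 'label' comes from the line above; the sample values, the '%.6f' precision and the in-memory buffer are assumptions for illustration only.

```python
import csv
import io

import pandas as pd

# Made-up values for the 'label' column.
df = pd.DataFrame({"label": [2.19e-04, 1.0, -4.35e-02]})

# Format each float as a fixed-point string before building the list.
formatted = df["label"].apply(lambda x: "%.6f" % x).tolist()

# Write the strings with csv.writer (an in-memory buffer stands in for a file).
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["label"])
for text in formatted:
    writer.writerow([text])
print(buf.getvalue())

# If the frame is written directly, to_csv accepts a float_format as well.
direct = df.to_csv(index=False, float_format="%.6f")
print(direct)
```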
Try this which will give you scientific notation only for large and very small values (and adds a thousands separator unless you omit the ","):
pd.set_option('display.float_format', lambda x: f'{x:,g}')
Or to almost completely suppress scientific notation without losing precision, try this:
pd.set_option('display.float_format', str)
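To see where the two formatters above still fall back to scientific notation, you can try them on plain Python floats without pandas at all. The helper name `compare_formats` and the sample values are mine, not part of any library.

```python
def compare_formats(values):
    """Return (value, ',g' text, str text, fixed-point text) for each float."""
    rows = []
    for v in values:
        rows.append((v, format(v, ",g"), str(v), format(v, ".6f")))
    return rows


# Arbitrary sample: a small value, a tiny value, and two larger ones.
samples = [0.000219, 0.00001, 21900.0, 1234567.891]
for value, general, plain, fixed in compare_formats(samples):
    print(f"{value!r:>14}  g={general:<14} str={plain:<14} f={fixed}")
```

The 'g' format switches to an exponent once a value needs more than six significant digits, and str does so for very small or very large magnitudes, which is why only an explicit fixed-point format such as '.6f' never shows an exponent.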
Quick fix that only changes the display (values are shown with no decimals, but the stored data is not rounded):
pd.options.display.float_format = '{:.0f}'.format
I tried all the options like
- pd.options.display.float_format = '{:.4f}'.format
- pd.set_option('display.float_format', str)
- pd.set_option('display.float_format', lambda x: f'%.{len(str(x%1))-2}f' % x)
- pd.set_option('display.float_format', lambda x: '%.3f' % x)
but nothing worked for me.
So while assigning the value (var1) to a variable (say num1), I used round(var1, 5):
num1 = round(var1, 5)
This is a crude method, as you have to call round in each assignment. But it gives you control over how the rounding happens, and you get what you want.
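One thing worth keeping in mind with the rounding approach: it changes the stored numbers, while the float_format options only change how they are printed. A small sketch of the difference, using made-up values:

```python
import pandas as pd

# Made-up series with more digits than we want to see.
s = pd.Series([0.000219, 0.123456789])

# Display-only: the text is shortened, the data keeps full precision.
with pd.option_context("display.float_format", "{:.5f}".format):
    shown = s.to_string()
print(shown)
print(s[1] == 0.123456789)  # the stored value is untouched

# Rounding: returns a new series whose values really are changed.
rounded = s.round(5)
print(rounded[1])
```

If the numbers feed into later calculations, the display option is usually the safer choice because no precision is thrown away.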