# How to maintain decimals when dividing with numpy arrays in Python

## Question:

So, I was working on implementing my own version of the statistical Test of Homogeneity in Python, where the user submits a list of lists and the function computes the corresponding chi value.

One issue I found was that my function was dropping decimals when performing division, resulting in a somewhat inaccurate chi value for small sample sizes.

Here is the code:

```python
import numpy as np
import scipy.stats as stats

def test_of_homo(list1):
    a = np.array(list1)
    #n = a.size
    num_rows = a.shape[0]
    num_cols = a.shape[1]
    dof = (num_cols-1)*(num_rows-1)
    column_totals = np.sum(a, axis=0)
    row_totals = np.sum(a, axis=1)
    n = sum(row_totals)
    b = np.array(list1)
    c = 0
    for x in range(num_rows):
        for y in range(num_cols):
            print("X is " + str(x))
            print("Y is " + str(y))
            print("a[x][y] is " + str(a[x][y]))
            print("row_totals[x] is " + str(row_totals[x]))
            print("column_total[y] is " + str(column_totals[y]))
            b[x][y] = (float(row_totals[x])*float(column_totals[y]))/float(n)
            print("b[x][y] is " + str(b[x][y]))
            numerator = ((a[x][y]) - b[x][y])**2
            chi =  float(numerator)/float(b[x][y])
            c = float(c)+ float(chi)
    print(b)
    print(c)
    print(stats.chi2.cdf(c, df=dof))
    print(1-(stats.chi2.cdf(c, df=dof)))

listc = [(21, 36, 30), (48, 26, 19)]

test_of_homo(listc)
```

When the results were printed, I saw that the `b[x][y]` values were `[[33 29 23] [35 32 25]]` instead of something like `33.35, 29.97, 23.68`, etc. This caused my resulting chi value to be 15.58 with a p of 0.0004 instead of the expected 14.5.

I tried converting everything to float, but that didn’t seem to work. Using `decimal.Decimal(b[x][y])` resulted in a type error. Any help?

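## Answer:

I think the problem is the data type of the arrays you build from the list. If you convert a list to a NumPy array without specifying the data type, NumPy infers it from the values, so a list of integers becomes an integer array. Assigning a float to an element of an integer array (as `b[x][y] = ...` does) truncates it, which is why wrapping things in `float()` didn’t help. You can check the inferred type: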

```python
>>> listc = [(21, 36, 30), (48, 26, 19)]
>>> a = np.array(listc)
>>> a.dtype
dtype('int64')
```
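
To make the truncation concrete, here is a small sketch of what happens when a float is stored in an integer array. The value used is the first expected count from the question's table (row total 87, column total 69, n = 180); the variable names `expected` and `stored` are only for illustration.

```python
import numpy as np

# Integer dtype is inferred because every value in the list is an int.
b = np.array([(21, 36, 30), (48, 26, 19)])

# Row total * column total / n for the first cell, roughly 33.35.
expected = 87.0 * 69.0 / 180.0

# Storing the float in the integer array drops the fractional part.
b[0][0] = expected
stored = b[0][0]

print(expected, stored)
```

The division itself is done in floating point; the decimals are lost only at the moment the result is written into `b`.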

Here is how you force conversion to a desired data type:

```python
>>> a = np.array(listc, dtype=float)
>>> a.dtype
dtype('float64')
```

Try that in the 1st and 9th lines of your function (where `a` and `b` are created) and see if it solves the problem. If you do this, you shouldn’t need to call `float()` everywhere.
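
As a rough sketch of where this can lead, once the array is float you could also drop the loops and compute the expected counts in one step. This is not your original function: the name `chi_square_homogeneity` and its return values are made up for illustration, and it uses `np.outer` in place of the nested loops.

```python
import numpy as np

def chi_square_homogeneity(table):
    """Hypothetical helper: chi-square statistic for a table of counts."""
    # dtype=float keeps the decimals in every later calculation.
    observed = np.array(table, dtype=float)
    row_totals = observed.sum(axis=1)
    column_totals = observed.sum(axis=0)
    n = observed.sum()

    # Expected count for each cell: row total * column total / n.
    expected = np.outer(row_totals, column_totals) / n

    # Sum of (observed - expected)^2 / expected over all cells.
    chi = ((observed - expected) ** 2 / expected).sum()
    dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)
    return chi, dof, expected

chi, dof, expected = chi_square_homogeneity([(21, 36, 30), (48, 26, 19)])
print(expected)
print(chi, dof)
```

The p-value can then be computed exactly as in your code, with `1 - stats.chi2.cdf(chi, df=dof)`.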
