What's the difference between np.divide(x, y) and x/y in Python3?
Question:
I recently found a bug in my code that I was able to fix by replacing np.divide(x, y) with x / y.
I was under the impression that np.divide(x, y) was equivalent to x / y (it says as much in the numpy documentation).
Is this a bug in numpy or is it expected behaviour?
As I said, my immediate issue is solved, so I’m not too worried about finding a fix; I am more curious to understand what’s going on.
import numpy as np
x1 = np.array([[281], [15831], [30280], [975], [313], [739], [252], [10364], [21480], [1447], [315], [772], [95], [2710], [7408], [215], [111], [158], [0], [88], [21], [661], [0], [0], [0], [5], [4], [0], [12], [0], [0], [50], [28], [0], [0], [272]])
x2 = np.array([[499], [6315], [33800], [580], [208], [464], [384], [3127], [19596], [2319], [218], [1740], [217], [411], [4250], [223], [406], [267], [2], [0], [16], [0], [0], [0], [0], [8], [3], [0], [18], [0], [1], [0], [41], [0], [0], [0]])
x3 = np.array([[507], [6180], [34005], [555], [200], [451], [390], [3024], [19492], [2425], [211], [1848], [223], [396], [4097], [224], [406], [282], [2], [0], [16], [0], [0], [0], [0], [8], [3], [0], [19], [0], [2], [0], [45], [0], [0], [0]])
x4 = np.array([[507], [6178], [34017], [554], [200], [451], [391], [3022], [19486], [2439], [210], [1865], [223], [396], [4089], [224], [406], [284], [2], [0], [16], [0], [0], [0], [0], [8], [3], [0], [19], [0], [2], [0], [46], [0], [0], [0]])
# n*: max of np.divide(..., where=mask); t*: max of plain division on the masked elements only
not_zero = (x1 + x2) != 0
x = np.divide(2*(x1 - x2)**2, x1 + x2, where=not_zero)
r = (2*(x1[not_zero] - x2[not_zero])**2) / (x1[not_zero] + x2[not_zero])
print("n1 =", x.max(), "t1 =", r.max())
not_zero = (x2 + x3) != 0
x = np.divide(2*(x2 - x3)**2, x2 + x3, where=not_zero)
r = (2*(x2[not_zero] - x3[not_zero])**2) / (x2[not_zero] + x3[not_zero])
print("n2 =", x.max(), "t2 =", r.max())
not_zero = (x3 + x4) != 0
x = np.divide(2*(x3 - x4)**2, x3 + x4, where=not_zero)
r = (2*(x3[not_zero] - x4[not_zero])**2) / (x3[not_zero] + x4[not_zero])
print("n3 =", x.max(), "t3 =", r.max())
Output:
n1 = 8177.933351395286 t1 = 8177.933351395286
n2 = 873842.0 t2 = 6.501672240802676
n3 = 1322.0 t3 = 0.15566927013196877
Python version: 3.7.6
Numpy version: 1.17.0
Answers:
The parameter where=mask without the parameter out is somewhat dangerous. Without a target for the output, the function builds an np.empty array of the appropriate shape, and then replaces some subset of that array with the output data.
But np.empty isn’t, well, empty. It’s just a block of memory that hasn’t been initialized (so it still holds whatever garbage data was in that memory block before). So wherever mask is False, your output will be that leftover garbage. If that memory block happens to hold binary garbage that decodes to a number bigger than the rest of your data, it ends up being your max value.
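Because the contents of an uninitialized array are unpredictable, the effect is easier to see with a stand-in. The sketch below is only an illustration with made-up inputs (num, den and the -999.0 sentinel are not from the question): it pre-fills out with a sentinel that plays the role of the "garbage" and shows that where only controls which positions get written.

```python
import numpy as np

# Made-up example data; den contains zeros we want to skip.
num = np.array([1.0, 2.0, 3.0, 4.0])
den = np.array([2.0, 0.0, 4.0, 0.0])
mask = den != 0

# Sentinel standing in for whatever happened to be in memory.
out = np.full_like(num, -999.0)

# Only positions where mask is True are written; the rest are left untouched.
np.divide(num, den, out=out, where=mask)

print(out.tolist())  # [0.5, -999.0, 0.75, -999.0]
```

When out is omitted, the untouched positions hold arbitrary leftover values instead of the sentinel, which is why x.max() in the question can return a number that was never computed.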
You can mask out the garbage using not_zero as a mask again:
x = np.divide(2*(x1 - x2)**2, x1 + x2, where=not_zero)[not_zero]
or initialize your output array yourself:
# use a float output: the result of true division cannot be written into an integer array
x = np.zeros_like(x1, dtype=float)
np.divide(2*(x1 - x2)**2, x1 + x2, where=not_zero, out=x)
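If you need this in several places, the second approach can be wrapped up. This is a minimal sketch, not a NumPy function: masked_divide and its fill parameter are hypothetical names, and the small arrays a and b are made up to mimic the shape of the data in the question. The output buffer is created as float because true division produces floats.

```python
import numpy as np

def masked_divide(num, den, fill=0.0):
    """Hypothetical helper: num / den where den != 0, `fill` everywhere else."""
    num = np.asarray(num, dtype=float)
    den = np.asarray(den, dtype=float)
    # Explicitly initialized float buffer, so skipped positions are well defined.
    out = np.full(np.broadcast(num, den).shape, fill, dtype=float)
    np.divide(num, den, out=out, where=den != 0)
    return out

# Made-up column vectors in the same layout as x1 and x2.
a = np.array([[281], [0], [5], [0]])
b = np.array([[499], [0], [8], [2]])

stat = masked_divide(2 * (a - b) ** 2, a + b)

# Reference: plain division restricted to the non-zero denominators.
nz = (a + b) != 0
ref = (2 * (a[nz] - b[nz]) ** 2) / (a[nz] + b[nz])

print(stat.max() == ref.max())  # True
```

Filling with 0.0 is safe for taking a max here because every computed value is non-negative; for other statistics a fill of np.nan combined with np.nanmax may be the better choice.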