Sum of Square Differences (SSD) in numpy/scipy

Question:

I’m trying to use Python and Numpy/Scipy to implement an image processing algorithm. The profiler tells me a lot of time is being spent in the following function (called often), which tells me the sum of square differences between two images

def ssd(A,B):
    s = 0
    for i in range(3):
        s += sum(pow(A[:,:,i] - B[:,:,i],2))
    return s

How can I speed this up? Thanks.

Asked By: Internet man

||

Answers:

Just

s = numpy.sum((A[:,:,0:3]-B[:,:,0:3])**2)

(which I expect is likely just sum((A-B)**2) if the shape is always (,,3))

You can also use the sum method: ((A-B)**2).sum()

Right?

Answered By: Andrew Jaffe

I do not know if the pow() function with power 2 will be fast. Try:

def ssd(A,B):
    s = 0
    for i in  range(3):
        s += sum((A[:,:,i] - B[:,:,i])*(A[:,:,i] - B[:,:,I]))
    return s
Answered By: Ritsaert Hornstra

I am confused why you are taking i in range(3). Is that supposed to be the whole array, or just part?

Overall, you can replace most of this with operations defined in numpy:

def ssd(A,B):
    squares = (A[:,:,:3] - B[:,:,:3]) ** 2
    return numpy.sum(squares)

This way you can do one operation instead of three and using numpy.sum may be able to optimize the addition better than the builtin sum.

Answered By: Mike Graham

Further to Ritsaert Hornstra’s answer that got 2 negative marks (admittedly I didn’t see it in it’s original form…)

This is actually true.

For a large number of iterations it can often take twice as long to use the ‘**’ operator or the pow(x,y) method as to just manually multiply the pairs together. If necessary use the math.fabs() method if it’s throwing out NaN’s (which it sometimes does especially when using int16s etc.), and it still only takes approximately half the time of the two functions given.

Not that important to the original question I know, but definitely worth knowing.

Answered By: Duncan Tait

In Ruby language you can achieve this in this way

def diff_btw_sum_of_squars_and_squar_of_sum(from=1,to=100) # use default values from 1..100. 
((1..100).inject(:+)**2) -(1..100).map {|num| num ** 2}.inject(:+)
end

diff_btw_sum_of_squars_and_squar_of_sum #call for above method
Answered By: Raza Hussain

You can try this one:

dist_sq = np.sum((A[:, np.newaxis, :] - B[np.newaxis, :, :]) ** 2, axis=-1)

More details can be found here (the ‘k-Nearest Neighbors’ example):
https://jakevdp.github.io/PythonDataScienceHandbook/02.08-sorting.html

Answered By: Xiaoyan Zhuo

Just to mention that one can also use np.dot:

def ssd(A,B):
  dif = A.ravel() - B.ravel()
  return np.dot( dif, dif )

This might be a bit faster and possibly more accurate than alternatives using np.sum and **2, but doesn’t work if you want to compute ssd along a specified axis. In that case, there might be a magical subscript formula using np.einsum.

Answered By: Jonathan H