Algorithm to find least sum of squares of differences

Question

The algorithm I'm writing takes a list L as input and should find a number x that minimizes the sum of the squared differences between each item of L and x, i.e. the x that minimizes the sum of abs(L[i]-x)**2. So far my algorithm does what it's supposed to, except when the answer is not an integer. I'm not sure how to handle floating-point values. For example, [2, 2, 3, 4] should ideally yield 2.75, but my algorithm currently can't return floating-point numbers.

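 def minimize_square(L):
     sumsqdiffs = {}
     # try every integer candidate from min(L) to max(L), inclusive
     for j in range(min(L), max(L) + 1):
         sumsqdiff = 0
         # sum the squared differences over every item of L
         for i in range(len(L)):
             sumsqdiff += abs(L[i]-j)**2
         sumsqdiffs[j] = sumsqdiff
     # return the candidate with the smallest sum
     return min(sumsqdiffs, key=sumsqdiffs.get)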

Asked By: madman2890

Answer 1

It is easy to prove [*] that the number that minimizes the sum of squared differences is the arithmetic mean of L. This gives the following simple solution:

In [26]: L = [2, 2, 3, 4]

In [27]: sum(L) / float(len(L))
Out[27]: 2.75

or, using NumPy:

In [28]: numpy.mean(L)
Out[28]: 2.75
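
Applied to the function from the question, this could look like the following minimal sketch. It reuses the name minimize_square from the question; the empty-list check is my own assumption about how that case should be handled.

```python
def minimize_square(L):
    """Return the x that minimizes sum((item - x)**2 for item in L)."""
    if not L:
        # Assumption: an empty list has no meaningful minimizer.
        raise ValueError("L must contain at least one number")
    # The minimizer is the arithmetic mean; float() keeps the
    # division exact on Python 2 as well.
    return sum(L) / float(len(L))


print(minimize_square([2, 2, 3, 4]))  # 2.75
```

This runs in a single pass over L instead of looping over every integer candidate.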

[*] Here is an outline of the proof:

We need to find x that minimizes f(x) = sum((x - L[i])**2) where the sum is taken over i=0..n-1.

Take the derivative of f(x) and set it to zero:

2*sum(x - L[i]) = 0

Using simple algebra, the above can be transformed into

x = sum(L[i]) / n

which is none other than the arithmetic mean of L. Since the second derivative, 2*n, is positive, this stationary point is indeed the minimum. QED.
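
If you want to convince yourself numerically, here is a small sanity check of the outline above. The helper names f and derivative are just illustrative, and the 0.01 offset is an arbitrary choice.

```python
def f(x, L):
    # Sum of squared differences between x and every item of L.
    return sum((x - item) ** 2 for item in L)


def derivative(x, L):
    # f'(x) = 2 * sum(x - L[i])
    return 2 * sum(x - item for item in L)


L = [2, 2, 3, 4]
mean = sum(L) / float(len(L))

# The derivative should vanish (up to rounding) at the mean ...
print(derivative(mean, L))
# ... and f should be larger a little to either side of it.
print(f(mean - 0.01, L) > f(mean, L) < f(mean + 0.01, L))
```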

Answered By: NPE

Answer 2

I am not 100% sure this is the most efficient way to do this, but you could keep the algorithm you have and modify the return statement:

min_int = min(sumsqdiffs, key=sumsqdiffs.get)
return bisection(L,min_int-1,min_int+1)

where bisection implements the bisection method.

This works iff there is a single minimum for the function in the analyzed interval.
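
The answer does not show bisection itself, so the following is only a rough sketch of one way it might be written. It assumes the function takes the list and the two interval endpoints, as in the call above, and it bisects on the sign of the derivative 2*sum(x - L[i]); the tol parameter is my own addition.

```python
def bisection(L, lo, hi, tol=1e-9):
    """Hypothetical helper: narrow [lo, hi] down to the minimizer of
    sum((x - item)**2 for item in L) by bisecting on the derivative."""
    def derivative(x):
        return 2 * sum(x - item for item in L)

    # Assumes derivative(lo) <= 0 <= derivative(hi), i.e. the single
    # minimum lies inside [lo, hi].
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if derivative(mid) > 0:
            hi = mid  # minimum is to the left of mid
        else:
            lo = mid  # minimum is at mid or to its right
    return (lo + hi) / 2.0


# For [2, 2, 3, 4] with the integer search landing on 3,
# the interval would be [2, 4]; the result is approximately 2.75.
print(bisection([2, 2, 3, 4], 2, 4))
```

Because the function being minimized is a parabola centered on the mean, the best integer candidate is never more than 1 away from the true minimizer, so the interval [min_int-1, min_int+1] always brackets it.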

Answered By: igon