For each cyclic shift of num1, calculate the sum of absolute differences with num2

Question:

num1 = [1, 4, 2, 11] 

num2  = [10, 1, 8, 4]

The output should be solution(arr1, arr2) = 7.

Explanation:

Let’s consider all possible cyclic shifts of num1:

  1. The first shift is [1, 4, 2, 11], and the sum of absolute differences with arr2 = [10, 1, 8, 4] equals to |1 – 10| + |4 – 1| + |2 – 8| + |11 – 4| = 9 + 3 + 6 + 7 = 25.

  2. The second shift is [4, 2, 11, 1], and the sum of absolute differences with arr2 = [10, 1, 8, 4] equals to |4 – 10| + |2 – 1| + |11 – 8| + |1 – 4| = 6 + 1 + 3 + 3 = 13.

  3. The third shift is [2, 11, 1, 4], and the sum of absolute differences with arr2 = [10, 1, 8, 4] equals to |2 – 10| + |11 – 1| + |1 – 8| + |4 – 4| = 8 + 10 + 7 + 0 = 25.

  4. The fourth shift is [11, 1, 4, 2], and the sum of absolute differences with arr2 = [10, 1, 8, 4] equals to |11 – 10| + |1 – 1| + |4 – 8| + |2 – 4| = 1 + 0 + 4 + 2 = 7.
    The lowest sum among all of the above is 7, which is the answer.

I have written the below code which has O(n2) - Time Complexity.

nums1 = [1, 4, 2, 11]
nums2 = [10, 1, 8, 4]
         
def find_min_sum(nums1, nums2):
        import heapq as hq
        
        heap = []
        
        for i in range(len(nums1)):
            sum, j = 0, 0
            
            while j < len(nums1):
                diff = abs(nums1[j]-nums2[j])
                sum += diff
                j += 1
            hq.heappush(heap, (sum, j-1, nums1[j-1]))
            
            nums1.append(nums1.pop(0))
            
        return hq.heappop(heap)[0]
                
    
find_min_sum(nums1, nums2)

How the above code can be improved to reduce "overall complexity"?

Asked By: meallhour

||

Answers:

Not sure about the complexity but this is more concise:

num1 = [1, 4, 2, 11] 

num2  = [10, 1, 8, 4]

a = []

for _ in range(len(num1)):
    a.append(sum(abs(x-y) for x, y in zip(num1, num2)))
    num1 = num1[1:] + [num1[0]]

print(min(a))

Output:

7
Answered By: Vlad

This answer has evolved as new insights were gained.
Each section (separated by a <hr>) represents a new step which invalidates the previous conclusions.


I could imagine that shifting the array is more expensive than just shifting one index (but you would have to actually compare the timing):

def find_min_sum_abs_diff_by_index_shift(nums1, nums2):
    l = len(nums1)
    assert l == len(nums2)
    min_sum_abs_diff = None
    for index_shift in range(l):
        sum_abs_diff = sum(
            abs(nums1[i] - nums2[(i + index_shift) % l]) for i in range(l)
        )
        if min_sum_abs_diff is None or sum_abs_diff < min_sum_abs_diff:
            min_sum_abs_diff = sum_abs_diff
    return min_sum_abs_diff

Comparing the timing with this corresponding solution that shifts the arrays:

def find_min_sum_abs_diff_by_array_shift(nums1, nums2):
    l = len(nums1)
    assert l == len(nums2)
    min_sum_abs_diff = None
    for _index_shift in range(l):
        sum_abs_diff = sum(abs(n1 - n2) for n1, n2 in zip(nums1, nums2))
        nums1 = nums1[1:] + [nums1[0]]  # (!) `nums1.append(nums1.pop(0))` is more efficient!
        if min_sum_abs_diff is None or sum_abs_diff < min_sum_abs_diff:
            min_sum_abs_diff = sum_abs_diff
    return min_sum_abs_diff

has shown that find_min_sum_abs_diff_by_index_shift is (for me using Python 3.10.5) actually slower by ~70% than find_min_sum_abs_diff_by_array_shift (measured for l = 4000).

As I still think that shifting the array could be avoided, I came up with another solution using a generator that provides the elements of the shifted array without shifting the array itself:

def find_min_sum_abs_diff_by_shift_generator(nums1, nums2):
    def shifted_array(array, index_shift):
        for element in array[index_shift:]:
            yield element
        for element in array[:index_shift]:
            yield element

    l = len(nums1)
    assert l == len(nums2)
    min_sum_abs_diff = None
    for index_shift in range(l):
        sum_abs_diff = sum(
            abs(n1 - n2) for n1, n2 in zip(nums1, shifted_array(nums2, index_shift))
        )
        if min_sum_abs_diff is None or sum_abs_diff < min_sum_abs_diff:
            min_sum_abs_diff = sum_abs_diff
    return min_sum_abs_diff

find_min_sum_abs_diff_by_shift_generator initially seemed to be faster than find_min_sum_abs_diff_by_array_shift by ~5%.


As pointed out in the comments (thanks!), my first version of find_min_sum_abs_diff_by_shift_generator still contained a bug (which wasn’t caught by the test case for nums1 = [1, 4, 2, 11] and nums2 = [10, 1, 8, 4]). Fixing that bug makes it again slower than find_min_sum_abs_diff_by_array_shift.

Thanks to @Kelly Bundy, find_min_sum_abs_diff_by_array_shift could even be sped up further by replacing nums1 = nums1[1:] + [nums1[0]] with nums1.append(nums1.pop(0)).

Summary

After an exciting journey of back and forth, the fastest solution currently known to me is:

def find_min_sum_abs_diff_by_array_shift_optimized_by_Kelly_Bundy(nums1, nums2):
    l = len(nums1)
    assert l == len(nums2)
    min_sum_abs_diff = None
    for _index_shift in range(l):
        sum_abs_diff = sum(abs(n1 - n2) for n1, n2 in zip(nums1, nums2))
        nums1.append(nums1.pop(0))
        if min_sum_abs_diff is None or sum_abs_diff < min_sum_abs_diff:
            min_sum_abs_diff = sum_abs_diff
    return min_sum_abs_diff

The relevant improvement was to calculate the sum via sum(abs(n1 - n2) for n1, n2 in zip(nums1, nums2)) instead of the manual loop (which even evaluates len(nums1) for every iteration):

            sum, j = 0, 0
            while j < len(nums1):
                diff = abs(nums1[j]-nums2[j])
                sum += diff
                j += 1

What’s the take away learning?

Never blindly "optimize" based on your intuition (which was "There must be a faster way which avoids mutating the array!" in my case), but always measure the actual effect in the relevant environment.

Answered By: Manfred

This still has O(n²) complexity, but uses the vectorial capabilities of :

import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

out = abs(sliding_window_view(np.tile(num2, 2)[:-1], len(num2))-num1).sum(1).min()

output: 7

Answered By: mozway