For each cyclic shift of num1, calculate the sum of absolute differences with num2
Question:
num1 = [1, 4, 2, 11]
num2 = [10, 1, 8, 4]
The output should be solution(arr1, arr2) = 7.
Explanation:
Let’s consider all possible cyclic shifts of num1:
-
The first shift is [1, 4, 2, 11], and the sum of absolute differences with arr2 = [10, 1, 8, 4] equals to |1 – 10| + |4 – 1| + |2 – 8| + |11 – 4| = 9 + 3 + 6 + 7 = 25.
-
The second shift is [4, 2, 11, 1], and the sum of absolute differences with arr2 = [10, 1, 8, 4] equals to |4 – 10| + |2 – 1| + |11 – 8| + |1 – 4| = 6 + 1 + 3 + 3 = 13.
-
The third shift is [2, 11, 1, 4], and the sum of absolute differences with arr2 = [10, 1, 8, 4] equals to |2 – 10| + |11 – 1| + |1 – 8| + |4 – 4| = 8 + 10 + 7 + 0 = 25.
-
The fourth shift is [11, 1, 4, 2], and the sum of absolute differences with arr2 = [10, 1, 8, 4] equals to |11 – 10| + |1 – 1| + |4 – 8| + |2 – 4| = 1 + 0 + 4 + 2 = 7.
The lowest sum among all of the above is 7, which is the answer.
I have written the below code which has O(n2) - Time Complexity.
nums1 = [1, 4, 2, 11]
nums2 = [10, 1, 8, 4]
def find_min_sum(nums1, nums2):
import heapq as hq
heap = []
for i in range(len(nums1)):
sum, j = 0, 0
while j < len(nums1):
diff = abs(nums1[j]-nums2[j])
sum += diff
j += 1
hq.heappush(heap, (sum, j-1, nums1[j-1]))
nums1.append(nums1.pop(0))
return hq.heappop(heap)[0]
find_min_sum(nums1, nums2)
How the above code can be improved to reduce "overall complexity"?
Answers:
Not sure about the complexity but this is more concise:
num1 = [1, 4, 2, 11]
num2 = [10, 1, 8, 4]
a = []
for _ in range(len(num1)):
a.append(sum(abs(x-y) for x, y in zip(num1, num2)))
num1 = num1[1:] + [num1[0]]
print(min(a))
Output:
7
This answer has evolved as new insights were gained.
Each section (separated by a <hr>
) represents a new step which invalidates the previous conclusions.
I could imagine that shifting the array is more expensive than just shifting one index (but you would have to actually compare the timing):
def find_min_sum_abs_diff_by_index_shift(nums1, nums2):
l = len(nums1)
assert l == len(nums2)
min_sum_abs_diff = None
for index_shift in range(l):
sum_abs_diff = sum(
abs(nums1[i] - nums2[(i + index_shift) % l]) for i in range(l)
)
if min_sum_abs_diff is None or sum_abs_diff < min_sum_abs_diff:
min_sum_abs_diff = sum_abs_diff
return min_sum_abs_diff
Comparing the timing with this corresponding solution that shifts the arrays:
def find_min_sum_abs_diff_by_array_shift(nums1, nums2):
l = len(nums1)
assert l == len(nums2)
min_sum_abs_diff = None
for _index_shift in range(l):
sum_abs_diff = sum(abs(n1 - n2) for n1, n2 in zip(nums1, nums2))
nums1 = nums1[1:] + [nums1[0]] # (!) `nums1.append(nums1.pop(0))` is more efficient!
if min_sum_abs_diff is None or sum_abs_diff < min_sum_abs_diff:
min_sum_abs_diff = sum_abs_diff
return min_sum_abs_diff
has shown that find_min_sum_abs_diff_by_index_shift
is (for me using Python 3.10.5) actually slower by ~70% than find_min_sum_abs_diff_by_array_shift
(measured for l = 4000
).
As I still think that shifting the array could be avoided, I came up with another solution using a generator that provides the elements of the shifted array without shifting the array itself:
def find_min_sum_abs_diff_by_shift_generator(nums1, nums2):
def shifted_array(array, index_shift):
for element in array[index_shift:]:
yield element
for element in array[:index_shift]:
yield element
l = len(nums1)
assert l == len(nums2)
min_sum_abs_diff = None
for index_shift in range(l):
sum_abs_diff = sum(
abs(n1 - n2) for n1, n2 in zip(nums1, shifted_array(nums2, index_shift))
)
if min_sum_abs_diff is None or sum_abs_diff < min_sum_abs_diff:
min_sum_abs_diff = sum_abs_diff
return min_sum_abs_diff
find_min_sum_abs_diff_by_shift_generator
initially seemed to be faster than find_min_sum_abs_diff_by_array_shift
by ~5%.
As pointed out in the comments (thanks!), my first version of find_min_sum_abs_diff_by_shift_generator
still contained a bug (which wasn’t caught by the test case for nums1 = [1, 4, 2, 11]
and nums2 = [10, 1, 8, 4]
). Fixing that bug makes it again slower than find_min_sum_abs_diff_by_array_shift
.
Thanks to @Kelly Bundy, find_min_sum_abs_diff_by_array_shift
could even be sped up further by replacing nums1 = nums1[1:] + [nums1[0]]
with nums1.append(nums1.pop(0))
.
Summary
After an exciting journey of back and forth, the fastest solution currently known to me is:
def find_min_sum_abs_diff_by_array_shift_optimized_by_Kelly_Bundy(nums1, nums2):
l = len(nums1)
assert l == len(nums2)
min_sum_abs_diff = None
for _index_shift in range(l):
sum_abs_diff = sum(abs(n1 - n2) for n1, n2 in zip(nums1, nums2))
nums1.append(nums1.pop(0))
if min_sum_abs_diff is None or sum_abs_diff < min_sum_abs_diff:
min_sum_abs_diff = sum_abs_diff
return min_sum_abs_diff
The relevant improvement was to calculate the sum via sum(abs(n1 - n2) for n1, n2 in zip(nums1, nums2))
instead of the manual loop (which even evaluates len(nums1)
for every iteration):
sum, j = 0, 0
while j < len(nums1):
diff = abs(nums1[j]-nums2[j])
sum += diff
j += 1
What’s the take away learning?
Never blindly "optimize" based on your intuition (which was "There must be a faster way which avoids mutating the array!" in my case), but always measure the actual effect in the relevant environment.
This still has O(n²) complexity, but uses the vectorial capabilities of numpy:
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view
out = abs(sliding_window_view(np.tile(num2, 2)[:-1], len(num2))-num1).sum(1).min()
output: 7
num1 = [1, 4, 2, 11]
num2 = [10, 1, 8, 4]
The output should be solution(arr1, arr2) = 7.
Explanation:
Let’s consider all possible cyclic shifts of num1:
-
The first shift is [1, 4, 2, 11], and the sum of absolute differences with arr2 = [10, 1, 8, 4] equals to |1 – 10| + |4 – 1| + |2 – 8| + |11 – 4| = 9 + 3 + 6 + 7 = 25.
-
The second shift is [4, 2, 11, 1], and the sum of absolute differences with arr2 = [10, 1, 8, 4] equals to |4 – 10| + |2 – 1| + |11 – 8| + |1 – 4| = 6 + 1 + 3 + 3 = 13.
-
The third shift is [2, 11, 1, 4], and the sum of absolute differences with arr2 = [10, 1, 8, 4] equals to |2 – 10| + |11 – 1| + |1 – 8| + |4 – 4| = 8 + 10 + 7 + 0 = 25.
-
The fourth shift is [11, 1, 4, 2], and the sum of absolute differences with arr2 = [10, 1, 8, 4] equals to |11 – 10| + |1 – 1| + |4 – 8| + |2 – 4| = 1 + 0 + 4 + 2 = 7.
The lowest sum among all of the above is 7, which is the answer.
I have written the below code which has O(n2) - Time Complexity.
nums1 = [1, 4, 2, 11]
nums2 = [10, 1, 8, 4]
def find_min_sum(nums1, nums2):
import heapq as hq
heap = []
for i in range(len(nums1)):
sum, j = 0, 0
while j < len(nums1):
diff = abs(nums1[j]-nums2[j])
sum += diff
j += 1
hq.heappush(heap, (sum, j-1, nums1[j-1]))
nums1.append(nums1.pop(0))
return hq.heappop(heap)[0]
find_min_sum(nums1, nums2)
How the above code can be improved to reduce "overall complexity"?
Not sure about the complexity but this is more concise:
num1 = [1, 4, 2, 11]
num2 = [10, 1, 8, 4]
a = []
for _ in range(len(num1)):
a.append(sum(abs(x-y) for x, y in zip(num1, num2)))
num1 = num1[1:] + [num1[0]]
print(min(a))
Output:
7
This answer has evolved as new insights were gained.
Each section (separated by a <hr>
) represents a new step which invalidates the previous conclusions.
I could imagine that shifting the array is more expensive than just shifting one index (but you would have to actually compare the timing):
def find_min_sum_abs_diff_by_index_shift(nums1, nums2):
l = len(nums1)
assert l == len(nums2)
min_sum_abs_diff = None
for index_shift in range(l):
sum_abs_diff = sum(
abs(nums1[i] - nums2[(i + index_shift) % l]) for i in range(l)
)
if min_sum_abs_diff is None or sum_abs_diff < min_sum_abs_diff:
min_sum_abs_diff = sum_abs_diff
return min_sum_abs_diff
Comparing the timing with this corresponding solution that shifts the arrays:
def find_min_sum_abs_diff_by_array_shift(nums1, nums2):
l = len(nums1)
assert l == len(nums2)
min_sum_abs_diff = None
for _index_shift in range(l):
sum_abs_diff = sum(abs(n1 - n2) for n1, n2 in zip(nums1, nums2))
nums1 = nums1[1:] + [nums1[0]] # (!) `nums1.append(nums1.pop(0))` is more efficient!
if min_sum_abs_diff is None or sum_abs_diff < min_sum_abs_diff:
min_sum_abs_diff = sum_abs_diff
return min_sum_abs_diff
has shown that find_min_sum_abs_diff_by_index_shift
is (for me using Python 3.10.5) actually slower by ~70% than find_min_sum_abs_diff_by_array_shift
(measured for l = 4000
).
As I still think that shifting the array could be avoided, I came up with another solution using a generator that provides the elements of the shifted array without shifting the array itself:
def find_min_sum_abs_diff_by_shift_generator(nums1, nums2):
def shifted_array(array, index_shift):
for element in array[index_shift:]:
yield element
for element in array[:index_shift]:
yield element
l = len(nums1)
assert l == len(nums2)
min_sum_abs_diff = None
for index_shift in range(l):
sum_abs_diff = sum(
abs(n1 - n2) for n1, n2 in zip(nums1, shifted_array(nums2, index_shift))
)
if min_sum_abs_diff is None or sum_abs_diff < min_sum_abs_diff:
min_sum_abs_diff = sum_abs_diff
return min_sum_abs_diff
find_min_sum_abs_diff_by_shift_generator
initially seemed to be faster than find_min_sum_abs_diff_by_array_shift
by ~5%.
As pointed out in the comments (thanks!), my first version of find_min_sum_abs_diff_by_shift_generator
still contained a bug (which wasn’t caught by the test case for nums1 = [1, 4, 2, 11]
and nums2 = [10, 1, 8, 4]
). Fixing that bug makes it again slower than find_min_sum_abs_diff_by_array_shift
.
Thanks to @Kelly Bundy, find_min_sum_abs_diff_by_array_shift
could even be sped up further by replacing nums1 = nums1[1:] + [nums1[0]]
with nums1.append(nums1.pop(0))
.
Summary
After an exciting journey of back and forth, the fastest solution currently known to me is:
def find_min_sum_abs_diff_by_array_shift_optimized_by_Kelly_Bundy(nums1, nums2):
l = len(nums1)
assert l == len(nums2)
min_sum_abs_diff = None
for _index_shift in range(l):
sum_abs_diff = sum(abs(n1 - n2) for n1, n2 in zip(nums1, nums2))
nums1.append(nums1.pop(0))
if min_sum_abs_diff is None or sum_abs_diff < min_sum_abs_diff:
min_sum_abs_diff = sum_abs_diff
return min_sum_abs_diff
The relevant improvement was to calculate the sum via sum(abs(n1 - n2) for n1, n2 in zip(nums1, nums2))
instead of the manual loop (which even evaluates len(nums1)
for every iteration):
sum, j = 0, 0
while j < len(nums1):
diff = abs(nums1[j]-nums2[j])
sum += diff
j += 1
What’s the take away learning?
Never blindly "optimize" based on your intuition (which was "There must be a faster way which avoids mutating the array!" in my case), but always measure the actual effect in the relevant environment.
This still has O(n²) complexity, but uses the vectorial capabilities of numpy:
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view
out = abs(sliding_window_view(np.tile(num2, 2)[:-1], len(num2))-num1).sum(1).min()
output: 7