When would you use recursive merge sort over iterative?

Question:

Is there ever a situation where you should use a recursive merge sort over an iterative merge sort? Originally I thought that iterative approaches to merge sort were typically faster, although I could not prove that in my own implementations. But recursion makes many more calls on the stack, which in turn makes it less memory-efficient. What if I have an extremely large dataset that I want to sort? Wouldn’t it be bad to use recursion then, since excessively deep recursion eventually leads to a stack overflow? Why would you ever use recursive over iterative if it is slower and less memory-efficient?

def merge_sort(arr):
    if len(arr) <= 1:
        return arr

    # Bottom-up: merge adjacent runs of size 1, 2, 4, ... until one run
    # covers the whole list.
    current_size = 1
    while current_size < len(arr):
        left = 0
        while left < len(arr)-1:
            mid = left + current_size - 1
            right = min((left + 2*current_size - 1), (len(arr)-1))
            # Merge the two adjacent runs, then write the result back.
            merged_arr = merge(arr[left : mid + 1], arr[mid + 1 : right + 1])
            for i in range(left, right + 1):
                arr[i] = merged_arr[i - left]
            left = left + current_size*2
        current_size = current_size * 2
    return arr

def merge(left, right):
    # Standard two-pointer merge of two sorted lists into a new sorted list.
    result = []
    i = 0
    j = 0
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result += left[i:]
    result += right[j:]
    return result

def merge_sort_recursive(arr):
    if len(arr) <= 1:
        return arr
    # Split in half, sort each half recursively, then merge the results.
    mid = len(arr) // 2
    left = arr[:mid]
    right = arr[mid:]
    left = merge_sort_recursive(left)
    right = merge_sort_recursive(right)
    return merge(left, right)

Asked By: ImNewAndLearning


Answers:

Update: Hmm, after writing my own, much simpler iterative one, I have to somewhat take back some of what I wrote…

def merge_sort_Kelly(arr):
    # Bottom-up: merge adjacent slices of width `half`, doubling each pass.
    half = 1
    while half < len(arr):
        for mid in range(half, len(arr), 2*half):
            start = mid - half
            stop = mid + half  # slicing clamps this past the end of the list
            arr[start:stop] = merge(arr[start:mid], arr[mid:stop])
        half *= 2
    return arr

Times for sorting three shuffled copies of list(range(2**17)) (Try it online!):

1.35 seconds merge_sort
0.91 seconds merge_sort_recursive
0.90 seconds merge_sort_Kelly

1.25 seconds merge_sort
1.05 seconds merge_sort_recursive
0.92 seconds merge_sort_Kelly

1.34 seconds merge_sort
0.81 seconds merge_sort_recursive
0.88 seconds merge_sort_Kelly

It’s pretty much as fast and, I’d say, almost as simple as the recursive one. Even the boundary check for stop was unnecessary after all, as Python slicing handles that for me. The imbalance issue remains.

About memory efficiency: your iterative one actually takes more memory than your recursive one, not less. Here are allocation peaks during sorting of list(range(2**17)), as measured with tracemalloc (Try it online!):

3,342,704 bytes  merge_sort
2,892,479 bytes  merge_sort_recursive
2,752,720 bytes  merge_sort_Kelly
  525,572 bytes  merge_sort_Kelly2 (see text below)
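
A harness along these lines could reproduce such a measurement (a sketch, assuming the sort functions above are defined in the same module; exact numbers vary by Python version and platform):

import tracemalloc

# Measure the peak allocation while each sort handles list(range(2**17)).
for sort in merge_sort, merge_sort_recursive, merge_sort_Kelly:
    arr = list(range(2**17))
    tracemalloc.start()
    sort(arr)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    print(f'{peak:>9,} bytes  {sort.__name__}')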

The peaks are reached during the final, top-level merge. Your iterative one takes more because, while computing the final merged_arr, that variable still holds the previous one. That can be avoided with del merged_arr when it’s no longer needed; then it takes only 2,752,832 bytes. And of course all our solutions could take less memory if we didn’t make so many slice copies but worked with indexes instead. That’s what merge_sort_Kelly2 does: it copies only in its merge function, copying just one half out and then merging that half with the other half (still in the original list) back into the original list, as sketched below.
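
The code for merge_sort_Kelly2 isn’t shown here; a minimal sketch of the idea, with a hypothetical merge_into helper, might look like this:

def merge_sort_Kelly2(arr):
    half = 1
    while half < len(arr):
        for mid in range(half, len(arr), 2 * half):
            merge_into(arr, mid - half, mid, min(mid + half, len(arr)))
        half *= 2
    return arr

def merge_into(arr, start, mid, stop):
    # Merge arr[start:mid] and arr[mid:stop] back into arr[start:stop],
    # copying only the left run out of the list.
    left = arr[start:mid]
    i, j, k = 0, mid, start
    while i < len(left) and j < stop:
        if left[i] <= arr[j]:
            arr[k] = left[i]
            i += 1
        else:
            arr[k] = arr[j]
            j += 1
        k += 1
    arr[k:k + len(left) - i] = left[i:]  # leftover left-run elements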

end of update, original answer:

“Why would you ever use recursive over iterative?”

Mainly because it’s simpler/nicer. For example, your recursive one can sort [3, 1, 4] while your iterative one crashes with an IndexError. No doubt because it’s more complicated.

The recursive one is also more balanced, needing fewer comparisons: left and right are always equally large or differ by just one element. For example, for arr = list(range(2**17)), both do 1,114,112 comparisons, because both are perfectly balanced. But with 2**17+1 elements, the iterative one does 1,245,184 comparisons while the recursive one does only 1,114,113, because at the end the iterative one merges 2^17 elements with 1 element (and that one element happens to be the largest). A way to reproduce these counts is sketched below.
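
One hypothetical way to reproduce those counts (not shown in the original answer) is to tally every comparison via an int subclass:

# Count comparisons by giving the elements a counting __lt__.
class Counted(int):
    count = 0
    def __lt__(self, other):
        Counted.count += 1
        return int.__lt__(self, other)

for n in (2**17, 2**17 + 1):
    for sort in merge_sort, merge_sort_recursive:
        Counted.count = 0
        sort([Counted(x) for x in range(n)])
        print(f'{n}: {Counted.count:,} comparisons ({sort.__name__})')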

“I timed these two implementations and found iterative does in fact appear to be faster.”

I get the opposite, even for 2^17 elements, where the iterative one doesn’t have the imbalance issue. Times for sorting three lists both ways:

1.23 seconds merge_sort
0.83 seconds merge_sort_recursive

1.25 seconds merge_sort
0.82 seconds merge_sort_recursive

1.19 seconds merge_sort
0.80 seconds merge_sort_recursive

Code:

from random import shuffle
from time import time

# Time both implementations on the same shuffled data, verifying the result.
for _ in range(3):
    arr = list(range(2**17))
    shuffle(arr)
    for sort in merge_sort, merge_sort_recursive:
        copy = arr[:]
        t0 = time()
        copy = sort(copy)
        print(f'{time()-t0:.2f} seconds {sort.__name__}')
        assert copy == sorted(arr)
    print()
Answered By: Kelly Bundy

The iterative version of merge sort is generally faster than the recursive version because it uses a loop, which is faster than function calls. Each function call in the recursive version adds a new layer to the call stack, which takes up memory and also takes time to set up and tear down. The iterative version does not have these overhead costs. So yes, I would say you are right in thinking that the recursive version is less memory-efficient.
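
Note, though, that merge sort’s recursion depth grows only logarithmically: each call recurses on halves of the input, so even huge lists stay far below Python’s default recursion limit of 1000. A rough sketch to check this (recursion_depth is a hypothetical helper, not part of either implementation above):

import sys

def recursion_depth(n):
    # merge_sort_recursive recurses on halves, so the depth of the
    # call stack is about log2(n).
    depth = 0
    while n > 1:
        n = (n + 1) // 2  # size of the larger half
        depth += 1
    return depth

print(recursion_depth(2**17))   # 17
print(sys.getrecursionlimit())  # typically 1000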

Additionally, the iterative version of merge sort can be easily parallelized, which can make it even faster on multi-core systems.
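
For example, the runs merged within a single bottom-up pass are independent of each other, so they could be handed to a pool of workers. A hypothetical sketch reusing the merge function from the question (in practice, process-pool overhead may well outweigh the gains for plain Python lists):

from concurrent.futures import ProcessPoolExecutor

def merge_sort_parallel(arr):
    # Within one bottom-up pass, every pair of adjacent runs is
    # independent, so their merges can run in parallel.
    half = 1
    with ProcessPoolExecutor() as pool:
        while half < len(arr):
            starts = range(0, len(arr), 2 * half)
            merged = pool.map(merge,
                              [arr[s:s + half] for s in starts],
                              [arr[s + half:s + 2 * half] for s in starts])
            arr = [x for run in merged for x in run]
            half *= 2
    return arr

# Note: needs to run under `if __name__ == "__main__":` on platforms
# that spawn worker processes (Windows, macOS).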

That being said, the difference in performance between the two implementations may not be significant for small arrays, but as the array grows larger, the iterative version will be faster.

As far as why you would use it, I would say in most situations it comes down to preference. Some people have an easier time thinking recursively than others, so they find that implementation cleaner. The only time I can think it would physically matter is when you really have to worry about memory or your dataset is extremely massive, which in my experience is rare; in that case I think you would probably reach for the iterative version.
