How to find the recurrence relation and apply the Master Theorem to a Merge Sort code?

Question:

I’m trying to apply the Master Theorem to this Merge Sort code, but first I need to find its recurrence relation, and I’m struggling to do and understand both. I’ve already seen some similar questions here, but I couldn’t follow the explanations. For example, do I first need to count how many operations the code performs? Could someone help me with that?


def mergeSort(alist):
    print("Splitting ",alist)
    if len(alist)>1:
        mid = len(alist)//2
        lefthalf = alist[:mid]
        righthalf = alist[mid:]

        mergeSort(lefthalf)
        mergeSort(righthalf)

        i=0
        j=0
        k=0
        while i < len(lefthalf) and j < len(righthalf):
            if lefthalf[i] < righthalf[j]:
                alist[k]=lefthalf[i]
                i=i+1
            else:
                alist[k]=righthalf[j]
                j=j+1
            k=k+1

        while i < len(lefthalf):
            alist[k]=lefthalf[i]
            i=i+1
            k=k+1

        while j < len(righthalf):
            alist[k]=righthalf[j]
            j=j+1
            k=k+1
    print("Merging ",alist)

alist = [54,26,93,17,77,31,44,55,20]
mergeSort(alist)
print(alist)

Asked By: Felipe L


Answers:

To determine the run-time of a divide-and-conquer algorithm using the Master Theorem, you need to express the algorithm’s run-time as a recursive function of input size, in the form:

T(n) = aT(n/b) + f(n)

T(n) is how we express the total runtime of the algorithm on an input of size n.

a stands for the number of recursive calls the algorithm makes.

T(n/b) represents the recursive calls: the n/b signifies that the input size to each recursive call is some particular fraction of the original input size (the divide part of divide-and-conquer).

f(n) represents the amount of work you need to do in the main body of the algorithm, generally just to combine solutions from recursive calls into an overall solution (you could say this is the conquer part).
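As a quick illustration of how to read this form (binary search is not part of the question; it’s just a familiar point of comparison): binary search makes a = 1 recursive call on half of the input (b = 2) and does only a constant amount of extra work, so its recurrence is

T(n) = 1*T(n/2) + O(1)

which, as we’ll see below, falls under the second case of the Master Theorem and gives T(n) = O(log(n)).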

Here’s a slightly re-factored definition of mergeSort:

def mergeSort(arr):
  if len(arr) <= 1: return # array size 1 or 0 is already sorted
  
  # split the array in half
  mid = len(arr)//2
  L = arr[:mid]
  R = arr[mid:]

  mergeSort(L) # sort left half
  mergeSort(R) # sort right half
  merge(L, R, arr) # merge sorted halves

We need to determine a, n/b, and f(n).

Because each call of mergeSort makes two recursive calls: mergeSort(L) and mergeSort(R), a=2:

T(n) = 2T(n/b) + f(n)

n/b represents the fraction of the current input that the recursive calls are made with. Because we are finding the midpoint and splitting the input in half, passing one half of the current array to each recursive call, n/b = n/2 and b = 2. (If each recursive call instead got 1/4 of the original array, b would be 4.)

T(n) = 2T(n/2) + f(n)

f(n) represents all the work the algorithm does besides making recursive calls. Every time we call mergeSort, we calculate the midpoint in O(1) time.
We also split the array into L and R, and technically creating these two sub-array copies is O(n). Then, presuming mergeSort(L) sorted the left half of the array and mergeSort(R) sorted the right half, we still have to merge the sorted sub-arrays together to sort the entire array with the merge function. Together, this makes f(n) = O(1) + O(n) + complexity of merge. Now let’s take a look at merge:

def merge(L, R, arr):
  i = j = k = 0    # 3 assignments
  while i < len(L) and j < len(R): # 2 comparisons
    if L[i] < R[j]: # 1 comparison, 2 array idx
      arr[k] = L[i] # 1 assignment, 2 array idx
      i += 1        # 1 assignment
    else:
      arr[k] = R[j] # 1 assignment, 2 array idx
      j += 1        # 1 assignment
    k += 1          # 1 assignment

  while i < len(L): # 1 comparison
    arr[k] = L[i]   # 1 assignment, 2 array idx
    i += 1          # 1 assignment
    k += 1          # 1 assignment

  while j < len(R): # 1 comparison
    arr[k] = R[j]   # 1 assignment, 2 array idx
    j += 1          # 1 assignment
    k += 1          # 1 assignment

This function has more going on, but we just need its overall complexity class to be able to apply the Master Theorem accurately. We can count every single operation, that is, every comparison, array index, and assignment, or just reason about it more generally. Generally speaking, across the three while loops we iterate through every member of L and R and assign them in order to the output array, arr, doing a constant amount of work for each element. Noting that we process every element of L and R (n total elements) and do a constant amount of work per element is enough to say that merge is in O(n).
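If you’d like to see that empirically, here is a small sketch (my own illustration, not part of the question’s code) that re-implements the same merge logic with a step counter; because every element of L and R is placed into arr exactly once, the counter always comes out equal to n:

import random

def merge_count(L, R, arr):
    # same merge logic as above, but returns how many elements get placed into arr
    ops = 0
    i = j = k = 0
    while i < len(L) and j < len(R):
        if L[i] < R[j]:
            arr[k] = L[i]
            i += 1
        else:
            arr[k] = R[j]
            j += 1
        k += 1
        ops += 1  # one placement, plus a constant number of comparisons/assignments
    while i < len(L):
        arr[k] = L[i]
        i += 1
        k += 1
        ops += 1  # one placement per leftover element of L
    while j < len(R):
        arr[k] = R[j]
        j += 1
        k += 1
        ops += 1  # one placement per leftover element of R
    return ops

for n in [100, 200, 400, 800]:
    L = sorted(random.sample(range(10 * n), n // 2))
    R = sorted(random.sample(range(10 * n), n - n // 2))
    print(n, merge_count(L, R, [0] * n))  # the count is exactly n for every input size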

You can also count the operations more precisely by hand if you want. For the first while loop, every iteration we make 3 comparisons, 4 array indexes, and 3 assignments (constant numbers), and the loop runs until one of L and R is fully processed. Then, one of the next two while loops may run to process any leftover elements from the other array, performing 1 comparison, 2 array indexes, and 3 variable assignments for each of those elements (constant work). Therefore, because each of the n total elements of L and R causes at most a constant number of operations to be performed across the while loops (either 10 or 6, by my count, so at most 10), and the i=j=k=0 statement is only 3 constant assignments, merge is in O(3 + 10*n) = O(n). Returning to the overall problem, this means:

f(n) = O(1) + O(n) + complexity of merge
     = O(1) + O(n) + O(n)
     = O(2n + 1)
     = O(n)

T(n) = 2T(n/2) + n

One final step before we apply the Master Theorem: we want f(n) written as n^c. For f(n) = n = n^1, c=1. (Note: things change very slightly if f(n) = n^c*log^k(n) rather than simply n^c, but we don’t need to worry about that here)
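Concretely, for the recurrence we derived:

f(n) = n = n^1, so c = 1 and b^c = 2^1 = 2

and we’ll compare this b^c against a = 2 in a moment.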

You can now apply the Master Theorem, which in its most basic form says to compare a (how quickly the number of recursive calls grows) to b^c (how quickly the amount of work per recursive call shrinks). There are 3 possible cases, the logic of which I try to explain, but you can ignore the parenthetical explanations if they aren’t helpful:

  1. a > b^c, T(n) = O(n^log_b(a)). (The total number of recursive calls is growing faster than the work per call is shrinking, so the total work is determined by the number of calls at the bottom level of the recursion tree. The number of calls starts at 1 and is multiplied by a log_b(n) times because log_b(n) is the depth of the recursion tree. Therefore, total work = a^log_b(n) = n^log_b(a))

  2. a = b^c, T(n) = O(f(n)*log(n)). (The growth in the number of calls is balanced by the decrease in work per call. The work done at each level of the recursion tree is therefore the same, so the total work is just f(n)*(depth of tree) = f(n)*log_b(n) = O(f(n)*log(n)).)

  3. a < b^c, T(n) = O(f(n)). (The work per call shrinks faster than the number of calls increases. Total work is therefore dominated by the work at the top level of the recursion tree, which is just f(n))
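To make the cases concrete, here are three standard recurrences (not taken from the question, just for comparison) and where each one lands:

T(n) = 4T(n/2) + n    ->  a = 4, b^c = 2^1 = 2, so a > b^c (case 1)  ->  T(n) = O(n^log_2(4)) = O(n^2)
T(n) = 2T(n/2) + n    ->  a = 2, b^c = 2^1 = 2, so a = b^c (case 2)  ->  T(n) = O(n*log(n))
T(n) = 2T(n/2) + n^2  ->  a = 2, b^c = 2^2 = 4, so a < b^c (case 3)  ->  T(n) = O(n^2)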

For the case of mergeSort, we’ve seen that a = 2, b = 2, and c = 1. As a = b^c, we apply the 2nd case:

T(n) = O(f(n)*log(n)) = O(n*log(n))

And you’re done. This may seem like a lot of work, but coming up with a recurrence for T(n) gets easier the more you do it, and once you have a recurrence it’s very quick to check which case it falls under, making the Master Theorem quite a useful tool for solving more complicated divide-and-conquer recurrences.
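As a final sanity check, here is a short sketch (again my own illustration, not from the question) that counts the element comparisons a merge sort performs and divides by n*log2(n); the ratio stays roughly constant as n grows, which is exactly what T(n) = O(n*log(n)) predicts:

import math
import random

def merge_sort_comparisons(arr):
    # sorts arr in place and returns the number of element comparisons performed
    if len(arr) <= 1:
        return 0
    mid = len(arr) // 2
    L = arr[:mid]
    R = arr[mid:]
    count = merge_sort_comparisons(L) + merge_sort_comparisons(R)
    i = j = k = 0
    while i < len(L) and j < len(R):
        count += 1  # one element comparison per placement in the main merge loop
        if L[i] < R[j]:
            arr[k] = L[i]
            i += 1
        else:
            arr[k] = R[j]
            j += 1
        k += 1
    while i < len(L):
        arr[k] = L[i]
        i += 1
        k += 1
    while j < len(R):
        arr[k] = R[j]
        j += 1
        k += 1
    return count

for n in [1000, 2000, 4000, 8000]:
    data = random.sample(range(10 * n), n)
    comparisons = merge_sort_comparisons(data)
    print(n, comparisons, round(comparisons / (n * math.log2(n)), 3))  # ratio is roughly flat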

Answered By: inordirection