Return Statements in Recursion

Question:

So I have a fairly decent understanding of the concept of recursion, but some implementations really trip me up. Take for instance this simple fibonacci function:

def fib(x):
    if x == 0 or x == 1:
        return 1
    else:
        return fib(x-1) + fib(x-2)

I get that this breaks up the fibonacci calculation into smaller, more manageable chunks. But how exactly does it come to the end result? What exactly is return returning during the recursive cases? It seems like it is just returning a call to a function that will continue to call the function until it returns 1, but it never seems to do any real calculations/operations. Contrast this with the classic factorial function:

def factorial(n):
    if n == 1:
        return 1
    else:
        return n * factorial(n - 1)

Here, the function is clearly operating on n, a defined integer, each time, whereas the fibonacci function only ever operates on the function itself until 1 is returned.

Finally, things get even weirder when we bring something like the Merge Sort algorithm into play; namely this chunk of code:

    middle = int(len(L)/2)
    left = sort(L[:middle], lt)
    right = sort(L[middle:], lt)
    print(left, right)
    return merge(left, right, lt)

left and right seem to be recursively calling sort, yet the print statements seem to indicate that merge is working on every recursive call. So is each recursive call somehow “saved” and then operated on when merge is finally invoked on the return? I’m confusing myself more and more by the second…. I feel like I’m on the verge of a strong understanding of recursion, but my understanding of what exactly return does for recursive calls is standing in my way.

Asked By: user1427661


Answers:

Try this exercise:

What’s the value of fib(0)? What’s the value of fib(1)? Let’s write those down.

fib(0) == 1
fib(1) == 1

We know this because these are “base cases”: they match the first case in the fib definition.

Ok, let’s bump it up. What’s the value of fib(2)? We can look at the definition of the function, and it’s going to be:

fib(2) == fib(1) + fib(0)

We know what the value of fib(1) and fib(0) will be: both of those will do a little work, and then give us an answer. So we know fib(2) will eventually give us a value.

Ok, bump it up. What’s the value of fib(3)? We can look at the definition, and it’s going to be:

fib(3) == fib(2) + fib(1)

and we already know that fib(2) and fib(1) will eventually compute numbers for us. fib(2) will do a little more work than fib(1), but they’ll both eventually bottom out to give us numbers that we can add.

Go for small cases first, and see that when you bump up the size of the problem that the subproblems are things that we’ll know how to handle.
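
If it helps, the same exercise can be written as a few lines of Python. This is only a sketch of the "small cases first" idea, not a replacement for the recursive fib; the table name `known` is made up for this illustration, and the base cases are the ones from the question (fib(0) and fib(1) are both 1).

```python
# Write down the base cases, then bump the problem size up one step at a time.
# `known` is a hypothetical lookup table used only for this illustration.
known = {0: 1, 1: 1}  # base cases taken from the fib definition in the question

for n in range(2, 6):
    # Each new value needs only two values that are already written down.
    known[n] = known[n - 1] + known[n - 2]

for n in sorted(known):
    print(f"fib({n}) == {known[n]}")
```

Every step only ever adds two numbers we already have, which is exactly what the recursive version relies on.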

If you’ve gone through a standard high-school math class, you will have seen something similar to this already: mathematicians use what’s called “mathematical induction”, which is the same idea as the recursion we programmers use as a tool.

Answered By: dyoo

Not understanding how recursive functions work is quite common, but it really indicates that you just don’t understand how functions and returning work, because recursive functions work exactly the same as ordinary functions.

print 4

This works because the print statement knows how to print values. It is given the value 4, and prints it.

print 3 + 1

The print statement doesn’t understand how to print 3 + 1. 3 + 1 is not a value, it’s an expression. Fortunately print doesn’t need to know how to print an expression, because it never sees it. Python passes values to things, not expressions. So what Python does is evaluate the expression when the code is executed. In this case, that results in the value 4 being produced. Then the value 4 is given to the print statement, which happily prints it.

def f(x):
    return x + 1

print f(3)

This is very similar to the above. f(3) is an expression, not a value. print can’t do anything with it. Python has to evaluate the expression to produce a value to give to print. It does that by going and looking up the name f, which fortunately finds the function object created by the def statement, and calling the function with the argument 3.

This results in the function’s body being executed, with x bound to 3. As in the case with print, the return statement can’t do anything with the expression x + 1, so Python evaluates that expression to try to find a value. x + 1 with x bound to 3 produces the value 4, which is then returned.

Returning a value from a function makes the evaluation of the function-call expression become that value. So, back out in print f(3), Python has successfully evaluated the expression f(3) to the value 4. Which print can then print.

def f(x):
    return x + 2

def g(y):
    return f(y * 2)

print g(1)

Here again, g(1) is an expression, not a value, so it needs to be evaluated. Evaluating g(1) leads us to f(y * 2) with y bound to 1. y * 2 isn’t a value, so we can’t call f on it; we’ll have to evaluate that first, which produces the value 2. We can then call f on 2, which returns x + 2 with x bound to 2. x + 2 evaluates to the value 4, which is returned from f and becomes the value of the expression f(y * 2) inside g. This finally gives a value for g to return, so the expression g(1) is evaluated to the value 4, which is then printed.

Note that when drilling down to evaluate f(2) Python still “remembered” that it was already in the middle of evaluating g(1), and it comes back to the right place once it knows what f(2) evaluates to.

That’s it. That’s all there is. You don’t need to understand anything special about recursive functions. return makes the expression that called this particular invocation of the function become the value that was given to return. The immediate expression, not some higher-level expression that called a function that called a function that called a function. The innermost one. It doesn’t matter whether the intermediate function-calls happen to be to the same function as this one or not. There’s no way for return to even know whether this function was invoked recursively or not, let alone behave differently in the two cases. return always always always returns its value to the direct caller of this function, whatever it is. It never never never “skips” any of those steps and returns the value to a caller further out (such as the outermost caller of a recursive function).
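
Here is a small sketch of that rule, written with print as a function (Python 3) so it runs as-is. The names `inner`, `outer` and `depth` are invented for this example. The point is only that a returned value lands in the expression that made the call, whether or not the caller happens to be the same function.

```python
def inner(x):
    # Returns to whoever called inner, which in this script is outer.
    return x + 2

def outer(y):
    got = inner(y * 2)   # inner's value arrives here, not at the top level
    return got * 10      # outer still has its own work to do with it

def depth(n):
    # Recursive version: each call hands its value to the call directly above it.
    if n == 0:
        return 0
    below = depth(n - 1)  # value returned by the direct callee
    return below + 1      # this call's own contribution

print(outer(1))   # inner(2) gives 4, then outer returns 4 * 10
print(depth(3))
```

If return skipped straight to the outermost caller, outer would never get the chance to multiply by 10, and depth would never get to add its 1 at each level.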

But to help you see that this works, let’s trace through the evaluation of fib(3) in more detail.

fib(3):
    3 is not equal to 0 or equal to 1
    need to evaluate fib(3 - 1) + fib(3 - 2)
        3 - 1 is 2
        fib(2):
            2 is not equal to 0 or equal to 1
            need to evaluate fib(2 - 1) + fib(2 - 2)
                2 - 1 is 1
                fib(1):
                    1 is equal to 0 or equal to 1
                    return 1
                fib(1) is 1
                2 - 2  is 0
                fib(0):
                    0 is equal to 0 or equal to 1
                    return 1
                fib(0) is 1
            so fib(2 - 1) + fib(2 - 2) is 1 + 1
        fib(2) is 2
        3 - 2 is 1
        fib(1):
            1 is equal to 0 or equal to 1
            return 1
        fib(1) is 1
    so fib(3 - 1) + fib(3 - 2) is 2 + 1
fib(3) is 3

More succinctly, fib(3) returns fib(2) + fib(1). fib(1) returns 1, but fib(3) returns that plus the result of fib(2). fib(2) returns fib(1) + fib(0); both of those return 1, so adding them together gives fib(2) the result of 2. Coming back to fib(3), which was fib(2) + fib(1), we’re now in a position to say that that is 2 + 1 which is 3.

The key point you were missing was that while fib(0) or fib(1) returns 1, those 1s form part of the expressions that higher level calls are adding up.
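
You can get Python to produce a trace like the one above for you. The version below is a sketch, not the original fib: the extra `depth` and `log` parameters are additions for illustration only, and it assumes Python 3 (f-strings). The recursion itself is unchanged.

```python
def fib(x, depth=0, log=None):
    """Same recursion as the question's fib, with optional call/return logging."""
    indent = "    " * depth
    if log is not None:
        log.append(f"{indent}fib({x}) called")
    if x == 0 or x == 1:
        result = 1
    else:
        # Each recursive call returns a plain number to this call...
        left = fib(x - 1, depth + 1, log)
        right = fib(x - 2, depth + 1, log)
        # ...and the addition happens here, in the caller.
        result = left + right
    if log is not None:
        log.append(f"{indent}fib({x}) returns {result}")
    return result

log = []
answer = fib(3, log=log)
print("\n".join(log))
print("answer:", answer)
```

Each "returns" line sits at the same indentation as its matching "called" line, which shows every value going back to the call that asked for it.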

Answered By: Ben

You need to understand mathematical induction to really grasp the concept. Once that is understood, recursion is straightforward. Consider a simple function:

     def fun(a):
          if a == 0: return a
          else: return a + 10

What does the return statement do here? For any nonzero a, it simply returns a + 10. Why is this easy to understand? Of course, one reason is that there is no recursion. ;) The other is that both a and 10 are already available when return is executed.

Now, consider a simple program that sums the first n numbers using recursion. One important thing before coding a recursion is that you must understand how it is supposed to work mathematically. For the sum of n numbers, we know that if the sum of the first n-1 numbers is known, we can return that sum + n. Now what if we do not know that sum? Well, we find the sum of the first n-2 numbers and add n-1 to it.

So, sumofN(n) = n + sumofN(n-1).

Now comes the terminating part. We know that this can’t go on indefinitely, because sumofN(0) = 0.

so,

 sumofN(n) = 0, if n = 0,  
            n + sumofN(n-1) , otherwise

In code this would mean,

      def sumofN(n):  
          if n == 0: return 0  
          return n + sumofN(n-1)

Here, suppose we call sumofN(10). It returns 10 + sumofN(9). We have 10 with us. What about the other term? It is the return value of another function call, so we wait until that call returns. Since the function being called is nothing but itself, sumofN(10) waits until sumofN(9) returns. And when sumofN(9) reaches 9 + sumofN(8), it waits until sumofN(8) returns.

What actually happens is

10 + sumofN(9) , which is
10 + 9 + sumofN(8), which is
10 + 9 + 8 + sumofN(7) …..

and finally when sumofN(0) returns we have,

10 + 9 + 8 + 7 + 6 + 5 + 4 + 3 + 2 + 1 + 0 = 55
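
To see that expansion come out of actual code, here is a rough sketch. `sumofN` is the function from above; `expansion` is a made-up helper for this illustration that collects the terms instead of adding them, using the same recursive shape.

```python
def sumofN(n):
    if n == 0:
        return 0
    return n + sumofN(n - 1)

def expansion(n):
    """Return the terms that sumofN(n) ends up adding, in the order they appear."""
    if n == 0:
        return [0]
    # Same structure as sumofN: our own n, followed by whatever the smaller call gives back.
    return [n] + expansion(n - 1)

terms = expansion(10)
print(" + ".join(str(t) for t in terms), "=", sum(terms))
print("sumofN(10) =", sumofN(10))
```

The list only becomes complete once the innermost call (n == 0) has returned, which is the same waiting that the additions in sumofN do.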

This concept is all that is needed to understand recursion. :)
Now, what about mergesort?

   mergesort(someArray)  = { l = mergesort the left part of array,
                             r = mergesort the right part of the array,  
                             merge(l and r)  
                           }

Until the left part is available to be returned, it keeps calling mergesort on the leftmost subarrays. Once the left part has been returned, we call mergesort on the right part, which in turn starts with its own leftmost subarray. Once we have both a left and a right, we merge them.
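
For a runnable version, here is one possible sketch. It reuses the names `sort`, `merge` and `lt` from the question's snippet, but the question shows neither the base case nor the body of `merge`, so both are assumptions filled in here, as is the default `lt`.

```python
def merge(left, right, lt):
    """Merge two already-sorted lists into one sorted list (assumed implementation)."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        # Take from the right only when it is strictly smaller, which keeps the sort stable.
        if lt(right[j], left[i]):
            result.append(right[j])
            j += 1
        else:
            result.append(left[i])
            i += 1
    # One side is exhausted; whatever remains on the other is already in order.
    result.extend(left[i:])
    result.extend(right[j:])
    return result

def sort(L, lt=lambda a, b: a < b):
    # Assumed base case: a list of 0 or 1 items is already sorted.
    if len(L) < 2:
        return L[:]
    middle = len(L) // 2
    left = sort(L[:middle], lt)    # finishes completely before the next line runs
    right = sort(L[middle:], lt)
    return merge(left, right, lt)  # merges two sorted halves at every level

print(sort([5, 2, 4, 1, 3]))
```

Note that merge runs once in every call that has two or more items, not just once at the end, which is why a print placed before it shows sorted pieces at every level of the recursion.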

One thing about recursion is that it is so damn easy once you look at it from the right perspective, and that right perspective is called mathematical induction.

Answered By: Emil