wrinkles has asked for the wisdom of the Perl Monks concerning the following question:

Monks,

The first example code of HOP is something like this (I added print statements):
sub binary { my ($n) = @_; return $n if $n == 0 || $n == 1; my $k = int($n/2); my $b = $n % 2; print "n is $n.\nk is $k.\nb is $b.\n"; print "above\n"; my $E = binary($k); print "below\n"; print "n is $n.\nE is $E.\n\n"; return $E . $b; } print binary(8);
And the output:
n is 8. k is 4. b is 0. above n is 4. k is 2. b is 0. above n is 2. k is 1. b is 0. above below n is 2. E is 1. below n is 4. E is 10. below n is 8. E is 100. 1000
The program "loops" 3 times before the final return. Obviously it is not a simple loop, though, and that is what confuses me. I don't understand the recursion process. I naively expect above/below above/below above/below return sequence, rather than above/above/above below/below/below return.

UPDATE: And notice the sequence of "n" value is an inside-out sequence. What?

I'm not seeing this, please help me understand. Thanks.

UPDATE 2: Thanks to all Monks for your generous and informative replies. PerlMonks is certainly the best resource for information in the Perl Community.

Replies are listed 'Best First'.
Re: recursion basics
by aaron_baugher (Curate) on Jun 23, 2015 at 06:32 UTC

    With recursion, a function can call itself, and each call creates a new copy of the function, complete with its own local variables. Each time the function calls itself, the first instance of it stops and waits for the second one to finish -- which may call a third instance and wait for that to finish, and so on.

    In the case of your example, the first call to binary() creates a running copy of the function (let's call it binary1) and passes it the value 8. It calculates a few things and sets a few variables which are local to this instance ($n, $k, $b). Then it makes a call to binary(), passing it the value 4. This creates a second running copy of binary() (binary2), which will have its own copy of those variables ($n, $k, $b), completely separate from binary1's copies. As soon as binary2 is called, binary1 stops and waits for it to return.

    So now binary2 is running, and goes through the same process of setting variables, except that this time the values are different because it was called with a different argument. It gets down to a third call to binary(), to which it now passes $k = 2. Now binary2 stops and waits for this third copy (binary3) to return. Binary3 sets its own copies of the variables, and passes the value 1 to a fourth call to binary() (binary4), while binary3 stops and waits for binary4 to return.

    Now there are four copies of the binary() function running, each with its own local copies of several variables, and the first three each waiting on the next to finish. When binary4 hits the return() line, the test returns true this time ($n==1), so the whole thing starts unspooling. Binary4 returns $n back to binary3, which puts that value in $E and continues where it stopped. When binary3 gets to its last instruction and returns, binary2 picks up where it stopped by setting $E to the return value of binary3. Binary2 finishes and returns, letting binary1 set $E to the return value of binary2. Binary1 (the very first call to the function) now continues and finishes, and passes its own return value to print().

    Recursion can be hard to get a grip on at first, but step through the process one step at a time, keeping in mind that each new call to the function creates a new copy of it, with its own copy of any my/local variables. Remember that each time the function calls itself, the caller stops and waits for the called to return. If necessary, draw it out on paper, making a separate box for each copy of the function, writing its variables inside it, with arrows to point from each copy back to the point in its caller where it will return to.

    Aaron B.
    Available for small or large Perl jobs and *nix system administration; see my home node.

      Wow, nice explanation. I'll get a good night's rest and read that again slowly, and yes draw a diagram. Thanks to all for your help!
Re: recursion basics
by robby_dobby (Hermit) on Jun 23, 2015 at 06:41 UTC
    Hello wrinkles,

    The thing about recursion is, (and this is what trips up most learners) it does NOT compute results UNTIL it has reached the terminating condition. That is, it would go on accumulating non-computed results and builds the final value once it reaches the end of the recursion condition.

    Now, it should be seen that for recursion to happen - it always needs two factors in place:

    • The means to compute the next thunk (this is what some people call a non-computed result, particularly in Functional Programming communities). This should already be given, it's just the same function - binary above.
    • The terminating condition. In the case of binary - the subroutine would just keep chipping down the number until there's nothing left. If it reaches that stage, all it has to do is prepend all previous results. 0 0 1, worked backwards would become 1 0 0 and you have your answer.

    Hopefully, I can make this clear by using a well known factorial example.

    sub factorial { my ($n) = shift; if ($n < 2) { return 1; } else { return $n * factorial($n - 1); } }

    Now, consider what would happen if you were to invoke it as: factorial(5). You have the terminating condition and the means to compute the next thunk. It should look something like below:

    factorial(5)
    5 * factorial(4)
    5 * 4 * factorial(3)
    5 * 4 * 3 * factorial(2)
    5 * 4 * 3 * 2 * factorial(1)
    5 * 4 * 3 * 2 * 1
    = 120
    

    Understood so far? Let's take a look at binary:

    • There are 3 variables here: The number itself, the chipped off previous number ($k) and the bit value of that number ($b - this is just what we would get if we divide by 2 and see the remainder - you could simply do $k % 2).
    • Is the number broken down completely such that there's no more left? $n == 0 || $n == 1?. if so, just return 1. (It'll never reach down to 0, hopefully :-)
    • So, there's more to this number - let's compute its bit value and create our thunk: $E = binary($k). We pass in our chipped down number.
    • Ah, so we reached the terminating condition. Finally, we have our value result, not a thunk! Great - let's now prepend all our results back in with so many of these thunks: return $E. $b;. So they all become: 0 0 1 --> 1 0 0

    Hopefully, it's all a bit more clear now. You can read more about recursion on Wikipedia here.

      I will have to look at all this info with fresh eyes in the morning, but my overall impression is that the code and data structure is build "outside-in", then evaluated "inside-out" to reach the final return.

      This is almost making sense! Thanks again!
Re: recursion basics
by NetWallah (Canon) on Jun 23, 2015 at 06:08 UTC
    Hopefully, the next few statements will make sense when read slowly.:

    The (recursive) call to binary() occurs between the "print ABOVE" and "print BELOW"

    This means that before the BELOW gets printed, the sub is called again, going through he "ABOVE" logic.

    This causes ABOVE to be printed again .. and again on each recursive call, not giving BELOW a chance until the recursion completes.

    Now - the RECURSED instance also goes through the same steps, and is unable to print BELOW until the NEXT recursion completes.

    So, the "BELOW"s stack up and are released together as the recursion unwinds.

    If you use the perl "-d" switch, you can step through the recursion, and even do a "T" stack trace to see what the sequence is.

            "Despite my privileged upbringing, I'm actually quite well-balanced. I have a chip on both shoulders."         - John Nash

      Even slow that's a challenge. Why does the final return only run once? I mean, why doesn't the return interrupt the unwinding earlier?

        Check my first reply for a more detailed explanation, but the short version is that, each time the function calls itself, the caller stops and waits for the "child" function to finish. So none of the instances reach that final return until one of them returns from the first return, then they all finish and hit the second return in reverse order. Something like this:

        binary1 calls binary2, stops and waits for it binary2 calls binary3, stops and waits for it binary3 calls binary4, stops and waits for it binary4 returns from the first return line ($n==1) binary3 finishes, returns binary2 finishes, returns binary1 finishes, returns print statement executes, printing binary1's return value

        Aaron B.
        Available for small or large Perl jobs and *nix system administration; see my home node.

        Read my post here. Hopefully, it should make it clear to you.

        Even slow that's a challenge. Why does the final return only run once? I mean, why doesn't the return interrupt the unwinding earlier?

        Um, try this, and ask yourself what happens

        Rip(); sub Rip { print "Rip on\n"; Van(); Winkle(); print "Rip off\n"; } sub Van { print "Van on\n"; Winkle(); print "Van off\n"; } sub Winkle { print "Winkle\n"; } __END__ Rip on Van on Winkle Van off Winkle Rip off
Re: recursion basics
by AnomalousMonk (Archbishop) on Jun 23, 2015 at 06:27 UTC
Re: recursion basics
by marinersk (Priest) on Jun 23, 2015 at 12:40 UTC

    Perhaps a diagram will help.

    print "main start\n"; &mySub(4); print "main end\n"; sub mySub { my ($oldValue, $callDepth, @leftOvers) = @_; if (!defined $oldValue) { $oldValue = 0; } if (!defined $callDepth) { $callDepth = 0; } # Increment call depth each time we get into mySub() $callDepth++; my $indentSize = $callDepth * 2; my $indentSpaces = sprintf("%${indentSize}s", ""); # Show we've arrived -- indent according to call depth print "${indentSpaces}mySub($oldValue) BEGIN\n"; # Cut value in half and recurse until zero my $newValue = int($oldValue / 2); if ($newValue > 0) { &mySub($newValue, $callDepth); } # Show we've returned -- indent according to call depth print "${indentSpaces}mySub($oldValue) END\n"; # Bye-bye return; }

    Results:

    main start mySub(4) BEGIN mySub(2) BEGIN mySub(1) BEGIN mySub(1) END mySub(2) END mySub(4) END main end

    Diagram of how it works (conceptually, if not implementorily):

      main start   main() mySub()
        ^
        |
        +-- You are here
            You call mySub(4)
      mySub(4) BEGIN   main() mySub(4)
                ^
                |
                +-- You are here
                    You call mySub(2)
        mySub(2) BEGIN   main() mySub(4)
                mySub(2)
                  ^
                  |
                  +-- You are here
                      You call mySub(1)
          mySub(1) BEGIN
     
     
     
          mySub(1) END
      main() mySub(4)
                mySub(2)
                  mySub(1)
                    ^
                    |
                    +-- You are here
      
        mySub(2) END   main() mySub(4)
                mySub(2)
                  ^
                  |
                  +-- You are here
      mySub(4) END   main() mySub(4)
                ^
                |
                +-- You are here
      main end   main() mySub()
        ^
        |
        +-- You are here

Re: recursion basics
by fullermd (Vicar) on Jun 23, 2015 at 09:22 UTC

    I naively expect above/below above/below above/below return sequence, rather than above/above/above below/below/below return.

    print "above\n"; my $E = binary($k); print "below\n";

    No, it has to finish the binary($k) before it can do the print "below\n". And that binary() call has its own above/binary()/below sequence to follow. It would be like saying:

    print "x\n"; print "y\n"; print "z\n";

    You wouldn't expect that to output "x/z/y", would you? Each line has to finish before the next one starts, and the recursive binary() call will (from the top) also involve another recursive call before it returns to the original call.

    It may be simpler to look a version without the calculations, just showing the call path:

    # How many steps we want to take my $steps = 3; sub recur { my ($x) = @_; # Create a number of leading spaces equal to our current $x my $ldr = " " x $x; # Note that we're starting print "${ldr}Start with $x\n"; # We only go to $steps to make it a smallish example if($x >= $steps) { print "${ldr}At $steps, returning.\n"; return; } # Recurse with the next number. Say we're doing it, do it, say we +'re # done. my $y = $x + 1; print "${ldr}-> Subcall with $y\n"; recur($y); print "${ldr}<- Back from $y\n"; # And now we're done with this number, so step back print "${ldr}Done with $x\n"; } # Start off recur(0);

    So now we're just calling ourselves $steps times, and noting where we go. Output:

    % ./tst.pl Start with 0 -> Subcall with 1 Start with 1 -> Subcall with 2 Start with 2 -> Subcall with 3 Start with 3 At 3, returning. <- Back from 3 Done with 2 <- Back from 2 Done with 1 <- Back from 1 Done with 0

    So what happens? We start off by calling recur() with 0. So that says it's doing so, then re-calls itself with 1. That says it's doing so, re-calls with 2. Says it's doing so, re-calls with 3.

    Now the 3 notices that's our limit, and so it returns. It returns back to the re-call that had 2, which now completes, and returns. It returns back to the re-call that had 1, which now completes, and returns. It returns back to the original call with 0, which returns. And then it falls off the end of the script.

Quicksort (Re: recursion basics)
by roboticus (Chancellor) on Jun 27, 2015 at 12:51 UTC

    wrinkles:

    You've already received plenty of good answers, but I'm sick and bored this morning, so I'll add my version to the mix. Many examples of recursion use the factorial problem, which I don't care for, as it's more easily and efficiently implemented as a loop. With an example like that, it's hard to "get" recursion, as many students will think that recursion is simply a crappy way to do a loop. The Tower of Hanoi, mentioned earlier in this thread is a really good recursion example. The one I like is the quicksort algorithm, which is the one I work through below.


    Suppose you need to write a function that sorts a list of numbers:

    sub list_sort { my @vals = @_; ... code to sort list ... return @sorted_list; }

    There are lots of ways to sort lists, some of which you probably already know. But ignore that for now. At some point, when thinking about the problem, you might notice that you can break the problem into a combination of smaller pieces.

    As an example, suppose you break your list down into two lists: all numbers smaller than some threshold in the first list, and all the remaining numbers in the second list. Then you could sort your list by sorting the two smaller lists and concatenating them, something like:

    sub list_sort { my @vals = @_; my @smaller = get_smaller(@vals); my @larger = get_larger(@vals); return list_sort(@smaller), list_sort(@larger); }

    The first condition to using recursion is that you must be able to break the original problem into a combination of smaller parts.

    So, what's the second condition to using recursion? Simply put, it's having a way to *stop* using recursion. Typically, when you break a problem into smaller and smaller parts, you'll reach a point where you have a trivial (degenerate) solution--one where you know the answer already.

    In the example of sorting a list, where we're breaking the list into smaller and smaller pieces, eventually we'll reach a point where our list contains 0 or 1 items. If you have 0 or 1 items in a list it's already sorted, so you can simply return the list.

    This means that our sorting routine turns into something like:

    sub list_sort { my @vals = @_; # trivial case: 0 or 1 items in the list return @vals if @vals < 2; my @smaller = get_smaller(@vals); my @larger = get_larger(@vals); return list_sort(@smaller), list_sort(@larger); }

    With that, our sort routine is pretty much done. For recursive routines, if you can describe the problem as a combination of smaller versions of the problem, and there's a terminating condition, you can create a recursive routine to solve it. The general form is normally just like that one above: return the answer if it's trivial, otherwise break the problem into one or more smaller parts, solve those, and compute the result from the smaller parts.

    The only detail we've left out in our list_sort routine is the part where we split the list into two smaller lists. This is typically the tricky part of a recursive routine: knowing how to split and combine the results to get the answer. For our sort, we have to split our input list into two lists such that all the values in the first list are smaller than all the values in the second list. How do we do that? If we just choose an arbitrary value out of the air, we could easily choose a value that's lower or higher than *all* numbers in our list, in which case we wouldn't be breaking the problem into smaller parts: one of the lists would be the *same* as the input list, so we'd fail the first condition--we wouldn't be breaking the problem into smaller parts.

    The inventor of quicksort had a clever idea: Use the first number of the incoming list as the threshold to split the original list. By doing so, we can *remove* the first number from the problem entirely, thus guaranteeing that the two remaining sorts would be smaller than the original list. The routine morphs into something like this:

    sub list_sort { my @vals = @_; # trivial case: 0 or 1 items in the list return @vals if @vals < 2; my $split_val = shift @vals; my @smaller = grep { $_ < $split_val } @vals; my @larger = grep { $_ > $split_val } @vals; return list_sort(@smaller), list_sort(@larger); }

    Here's a demo script that does our list sort, but with some extra junk in it to tell you what's happening so you can see it in action:

    $ cat qsort.pl #!/usr/bin/env perl # # example of recursion using quicksort # use strict; use warnings; # quicksort -- recursively sort a list by taking the first # element and using that as the "center" value, and breaking # the rest of the list into two lists: those lower than the # center value and those higher than the center value. # # then return qsort(lower), center, qsort(higher) # # The degenerate case is when you have a single value in your # list -- simply return it. # # This version helps illustrate the recursion by describing # what it's doing, with indentation to show the nesting of the # calls. # sub list_sort { my ($lev, @vals) = @_; #print " "x$lev, "qs(", join(" ", @vals), ")\n"; print " "x$lev, "qs(@vals)\n"; ++$lev; if (@vals < 2) { print " "x$lev, "degenerate case, returning (@vals)\n"; return @vals; } my $split_val = shift @vals; my @lower = grep { $_ lt $split_val } @vals; my @higher = grep { $_ gt $split_val } @vals; print " "x$lev, "(@lower), $split_val, (@higher)\n"; @vals = ( list_sort($lev, @lower), $split_val, list_sort($lev, @higher) ); print " "x$lev, "returning (@vals)\n"; return @vals; } my @list = qw(c a h f g e b d i); print "ORIG: @list\n"; @list = list_sort(0, @list); print "DONE: @list\n"; $ perl qsort.pl ORIG: c a h f g e b d i qs(c a h f g e b d i) (a b), c, (h f g e d i) qs(a b) (), a, (b) qs() degenerate case, returning () qs(b) degenerate case, returning (b) returning (a b) qs(h f g e d i) (f g e d), h, (i) qs(f g e d) (e d), f, (g) qs(e d) (d), e, () qs(d) degenerate case, returning (d) qs() degenerate case, returning () returning (d e) qs(g) degenerate case, returning (g) returning (d e f g) qs(i) degenerate case, returning (i) returning (d e f g h i) returning (a b c d e f g h i) DONE: a b c d e f g h i

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

Re: recursion basics
by locked_user sundialsvc4 (Abbot) on Jun 23, 2015 at 12:57 UTC

    A useful way to see recursion in action is to add a variable to the function&rquo;s parameter list which is simply used to show the recursion nesting-level . . .

    use strict; use warnings; sub foo { my $nesting_level = shift; print "Hello from level $nesting_level!\n"; if ($nesting_level < 3) { foo($nesting_level+1); } print "Goodbye from level $nesting_level!\n"; } foo(1); . . . gives . . . Hello from level 1! Hello from level 2! Hello from level 3! Goodbye from level 3! Goodbye from level 2! Goodbye from level 1!

    The outermost call begins first but finishes last.   The innermost call, which does not “call itself,” of course finishes immediately.

    This trivial example does not illustrate that each instance of the recursive function (like any instance of any function-call), has its own distinct set of local variables.

      Thank you for posting working code. It is a clear example. I tweaked a little. Not comprehensive but indentation, as shown in previous examples from other monks, in particular is good for recursion examples, I think.

      sub foo { # 4 spaces or tab is standard. Why be different here? # Even this spotty arg check is better than none. my $nesting_level = shift || die "Nesting level needed!"; # Indentation is a great way to visualize recursion. my $indent = " " x ( $nesting_level - 1 ); print $indent, "Hello from level $nesting_level!\n"; # It's Perl, why not postfix? foo($nesting_level+1) if $nesting_level < 3; print $indent, "Goodbye from level $nesting_level!\n"; }
      Hello from level 1! Hello from level 2! Hello from level 3! Goodbye from level 3! Goodbye from level 2! Goodbye from level 1!