Cody Fendant has asked for the wisdom of the Perl Monks concerning the following question:

I was playing around with some code to render a kind of templated HTML, and I came up with the code below.

The HTML contains special markup, <include id="n"> where n is an integer. These are looked up in a db but here they're just returned by a sub to make this question simple and self-contained. In the example below, there are nested includes where include 3 contains a reference to include 4 and 4 contains a reference to 5.

I did the rendering by re-calling the rendering function, inside the rendering function as the right-hand side of a regular expression.

My question is—why does this work? How come it renders all the way down instead of stopping after the first level? I thought it would need some kind of while loop to keep it rendering until it had processed all possible nested includes.

#!/usr/local/bin/perl print render(1); sub render { my $id = shift; $HTML = return_html($id); $HTML =~ s/<include id="(\d+)">/render($1)/eg; return $HTML; } ### this sub is only here to mimic database lookups sub return_html { my $id = shift; if ( $id == 1 ) { return '<html> <head> <include id="2"> <title>hello world</title> </head> <body> <div class="container"> <h1>Hello world</h1> <include id="3"> </div> </body> </html>'; } if ( $id == 2 ) { return '<link rel="stylesheet" href="foo.css">'; } if ( $id == 3 ) { return '<div id="footer">copyright <include id="4"></div>'; } if ( $id == 4 ) { return '2014 <include id="5">'; } if ( $id == 5 ) { return 'and 2015'; } }

Replies are listed 'Best First'.
Re: Recursively executed function in regex RHS, AKA "wait, why does this work?"
by choroba (Cardinal) on Oct 21, 2014 at 11:34 UTC
    Calling a function from the function itself is called "recursion". It's in fact a loop in disguise, in this case it loops until there is no include element in the replacement part. Each include element calls the function again until there are no such elements, then all the functions return.

    Read more on recursion.

    لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
Re: Recursively executed function in regex RHS, AKA "wait, why does this work?"
by mr_mischief (Monsignor) on Oct 21, 2014 at 13:43 UTC

    I recommend searching the web for "tail recursion" and "recursion vs. iteration" to see just how similar recursion and iterative loops really are. If you have time, get your hands on The Little Lisper or its descendent The Little Schemer and work through the whole thing including the exercises. It's another language, sure, but both are very influential languages and the books are really good at imparting some recursion wisdom. The Little Schemer even has sequels The Seasoned Schemer and The Reasoned Schemer which I'm told are also very good.

Re: Recursively executed function in regex RHS, AKA "wait, why does this work?"
by Athanasius (Archbishop) on Oct 21, 2014 at 15:33 UTC

    Hello Cody Fendant,

    choroba has explained what is happening, but recursion tends to be counter-intuitive and therefore hard to “grasp” or visualise. So I’ve rewritten your example code and added print statements in an attempt to make the logistics clearer:

    #! perl use strict; use warnings; my %includes = ( 1 => join("\n", '<html>', '<head>', ' <include id="2">', ' <title>hello world</title>', '</head>', '<body>', '<div class="container">', ' <h1>Hello world</h1>', ' <include id="3">', '</div>', '</body>', '</html>'), 2 => '<link rel="stylesheet" href="foo.css">', 3 => '<div id="footer">copyright <include id="4"></div>', 4 => '2014', ); render(1); sub render { my $id = shift; my $html = $includes{$id}; print '-' x 10, "\nrender($id) BEFORE:\n$html\n"; $html =~ s/<include id="(\d+)">/render($1)/eg; print "render($id) AFTER:\n$html\n", '-' x 10, "\n"; return $html; }

    Output:

    If you follow this through a few times, it eventually starts to make sense (no, really!). Pay particular attention to the order in which the four calls to sub render are interleaved.

    Hope that helps,

    Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

Re: Recursively executed function in regex RHS, AKA "wait, why does this work?"
by Laurent_R (Canon) on Oct 21, 2014 at 22:30 UTC
    There is an old saying according to which, "to understand recursion, you first need to understand recursion". It is indeed a bit difficult to grasp the first time(s), but it is self-evident once you understand.

    The archetypal example of a recursive program is the programming of the factorial function. The factorial of an integer may be defined as the product of that number and all the strictly positive integers smaller than the original number. Thus:

    fact(4) = 4 * 3 * 2 * 1 = 24
    Implementing a function calculating the factorial of a number can easily be achieved with an iterative loop. For example, computing the factorial of 10:
    $ perl -e ' sub fact { my $c = shift; my $result = 1; $result *= $_ for 1..$c; return $result;} print fact (shift);' 10 3628800
    The definition given above of the factorial function is a bit loose. A more precise mathematical definition might be a recursive definition:
    fact(1) = 1 fact (n) = n * fact (n-1)
    The idea is that if I need to compute fact(3), I know that it is equal to 3 * fact(2). And fact(2) = 2 * fact(1) = 2 * 1 = 2. So that, finally, fact (3) = 3 * 2 = 6.

    Implementing this mathematical definition in Perl is quite straight forward:

    $ perl -e ' sub fact { my $c = shift; return 1 if $c == 1; $c * fact($c-1); } print fact (shift);' 10 3628800
    This code is doing exactly what I have described with the mathematical definition: the fact function is called recursively with n-1, n-2, etc., until the argument is 1, at which point all the function call returns are executed to return the final value.

    Once you understand the process, this is a fairly efficient way of coding things like trees or chained lists traversing algorithms. For example, traversing a directory tree is usually much easier to code with a recursive algorithm than with an iterative one. But usually a bit less efficient time wise if that matters.

Re: Recursively executed function in regex RHS, AKA "wait, why does this work?"
by jellisii2 (Hermit) on Oct 21, 2014 at 11:52 UTC
    g in your regex means that it'll happen globally, so every instance will get processed, so line 7 actually gets fired multiple times. It APPEARS that since the line is getting modified after the current evaluation, that gets caught and processed. Testing seems to confirm this, as when you replace the IDs with 5 it works as expected. As to an actual reason as to WHY this is, I'd be interested in knowing myself. I assume that backtracking has something to do with this. perlre's section on backtracking may explain it better.

    See Choroba's response. I missed the quite obvious fact that it was calling itself.

      You're still right about /g being necessary for this to work. The recursion processes the descendants and the /g modifier makes sure all siblings are processed. There are actually two loop-like constructs in this code. s/REG/REPL/g does not however process the replaced text, or something like s/(.)/$1$1/g; would be an infinite loop.