in reply to Q on HTML::Element recursive lambda comment

I agree that the explanatory text is a bit nutty, but there is a reason. It's not so that the code will work correctly, or better, or whatnot. It's so that you won't leak memory if you run the code again and again.

As a demonstration, put the above code inside a sub, and then have something that calls that subroutine, say, 16_000_000 times. (Call it on the same very, very small tree - I recommend a tree that's no more than one element in size)

Now try the same thing without the undef.

The difference between the two sets of examples he gives is that the bad examples create a reference from $give_id to itself: the closure stored in $give_id refers to $give_id, which makes a circular data structure. Perl reclaims memory by reference counting, and the count on a circular structure never drops to zero on its own, so you shouldn't leave one around when the variable goes out of scope.
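To see the cycle without needing a memory monitor, here is a minimal sketch. The Sentinel class and make_cycle are hypothetical names invented for this illustration (they are not part of HTML::Element); the sentinel simply counts how many of its objects have been destroyed, so a closure that is never freed shows up as a sentinel that is never destroyed.

```perl
use strict;
use warnings;

# Hypothetical helper class: counts how many of its objects were freed.
{
    package Sentinel;
    our $destroyed = 0;
    sub new     { return bless {}, shift }
    sub DESTROY { $destroyed++ }
}

# Build a closure that refers to itself and also holds a Sentinel.
# If $break is true, clear the variable before it goes out of scope.
sub make_cycle {
    my ($break) = @_;
    my $guard = Sentinel->new;
    my $walk;
    $walk = sub {
        my ($depth) = @_;
        my $keep = $guard;                   # capture $guard: it lives as long as the closure
        $walk->($depth - 1) if $depth > 0;   # capture $walk: the closure refers to itself
        return;
    };
    $walk->(3);
    undef $walk if $break;                   # breaks the cycle
    return;
}

make_cycle(1);
my $freed_with_undef = $Sentinel::destroyed;

make_cycle(0);
my $freed_without_undef = $Sentinel::destroyed - $freed_with_undef;

print "freed with undef:    $freed_with_undef\n";
print "freed without undef: $freed_without_undef\n";
```

With the undef the sentinel is destroyed as soon as the sub returns; without it the sentinel stays alive until the interpreter shuts down, which is the same thing that happens to every leaked closure in a long loop.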

Re: Re: Q on HTML::Element recursive lambda comment
by lucylane (Acolyte) on Mar 08, 2004 at 20:08 UTC
    Okay, duh, thanks.

    Another question though; here's why I often use the "anonymous lambda" style (dynamic sub): if I do something like this (embedded sub):

    sub traverse {
        my $start_node = HTML::TreeBuilder->new;
        $start_node->parse_file(shift);
        {
            my $counter = 'x0000';
            sub give_id {
                my $x = $_[0];
                $x->attr('id', $counter++) unless defined $x->attr('id');
                foreach my $c ($x->content_list) {
                    give_id($c) if ref $c; # ignore text nodes
                }
            };
            give_id($start_node);
        }
    }

    The var $counter is captured in the closure and does not get reset on each call, even though I'd like it to be. It's either this way plus moving $counter outside the block and resetting it by hand (which I won't do, because it logically belongs inside the function and is easy to forget), or using the dynamic sub and risking a memory leak. How would I get the best of both worlds (uncaptured var + no risk of memory leak)? Using a private embedded package or putting everything in its own package seems wordy.

    Thanks for the enlightenment.

      Well, I'd do it by leaving in the undef statement. Either that, or replace it with something commented:
      {
          ...
          $give_id->($start_node);
          $give_id = 0; # Break circular structure and avoid memory leak
      }
      That is, just make sure to set $give_id to some value that doesn't depend on $give_id before $give_id goes out of scope.
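Another way to get both properties, not suggested in the thread itself, is to let the closure call itself through a weakened reference, so no cycle is ever formed and there is no cleanup line to forget. The following is only a sketch under assumptions: assign_ids and the nested-hash tree are hypothetical stand-ins for the HTML::Element code above, and weaken comes from the core Scalar::Util module.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

# Hypothetical stand-in for an HTML::Element tree: nested hashes with
# an optional 'id' and an optional list of 'children'.
sub assign_ids {
    my ($root) = @_;
    my $counter = 'x0000';          # a fresh counter on every call

    my $weak_self;                  # the closure captures this copy
    my $give_id = $weak_self = sub {
        my ($node) = @_;
        $node->{id} = $counter++ unless defined $node->{id};
        $weak_self->($_) for @{ $node->{children} || [] };
        return;
    };
    weaken($weak_self);             # the captured copy no longer keeps the sub alive

    $give_id->($root);
    return;                         # $give_id leaves scope, so the sub can be freed
}

my $tree_a = { children => [ {}, { children => [ {} ] } ] };
my $tree_b = { children => [ {} ] };
assign_ids($tree_a);
assign_ids($tree_b);

# The counter restarts for the second tree.
print "$tree_a->{id} $tree_a->{children}[1]{children}[0]{id} $tree_b->{id}\n";
```

The only strong reference to the sub is $give_id, which disappears at the end of assign_ids; the copy the sub uses to recurse is weak and does not count. The cost is one extra variable and a dependency on Scalar::Util, so the commented undef remains the simpler choice when that is acceptable.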
Re: Re: Q on HTML::Element recursive lambda comment
by fizbin (Chaplain) on Mar 08, 2004 at 21:07 UTC
    For what it's worth, I just tried abusing my laptop with this code:
    use strict;
    use HTML::Element;

    sub funny {
        my $counter = 'x0000';
        my $give_id;
        $give_id = sub {
            my $x = $_[0];
            $x->attr('id', $counter++) unless defined $x->attr('id');
            foreach my $c ($x->content_list) {
                $give_id->($c) if ref $c; # ignore text nodes
            }
        };
        $give_id->($_[0]);
        undef $give_id; ##### Remove for evil effects #####
    }

    my $a = HTML::Element->new('a', href => 'http://www.perl.com/');
    print "Start: ", scalar(localtime), "\n";
    for my $i (0..1_000_000) {
        funny($a);
    }
    print "Finish: ", scalar(localtime), "\n";

    With the undef in place, the difference between the "Start" and "Finish" times was 15 seconds. With the undef commented out, the difference between "Start" and "Finish" was only a little over a minute (1:04); however, five minutes after printing the "Finish" line my laptop was still swapping like mad and the shell prompt hadn't yet come back - this indicates that perl was having a devil of a time cleaning things up on exit. (I just gave up and hit Ctrl-C at that point)

    Incidentally, while the version without an undef was running, it took a full 30 seconds for my laptop to display the Ctrl-Alt-Del screen after I gave it the three-finger salute.

    I'm not going to try the 16_000_000 times I suggested above.