Q on HTML::Element recursive lambda comment

lucylane has asked for the wisdom of the Perl Monks concerning the following question:

In the HTML::Element::traverse documentation, there's a comment I can't figure out:

Now, if you've been reading Structure and Interpretation of Computer Programs too much, maybe you even want a recursive lambda. Go ahead:
{ my $counter = 'x0000'; my $give_id; $give_id = sub { my $x = $_[0]; $x->attr('id', $counter++) unless defined $x->attr('id'); foreach my $c ($x->content_list) { $give_id->($c) if ref $c; # ignore text nodes } }; $give_id->($start_node); undef $give_id; }
[download]
It's a bit nutty, and it's still more concise than a call to the traverse method! It is left as an exercise to the reader to figure out how to do the same thing without using a $give_id symbol at all. It is also left as an exercise to the reader to figure out why I undefine $give_id, above; and why I could achieved the same effect with any of:
$give_id = 'I like pie!'; # or... $give_id = []; # or even; $give_id = sub { print "Mmmm pie!\n" };
[download]
But not:
$give_id = sub { print "I'm $give_id and I like pie!\n" }; # nor... $give_id = \$give_id; # nor... $give_id = { 'pie' => \$give_id, 'mode' => 'a la' };
[download]

I've tried test programs to give me an error or other bad output but everything looks the same (w/ and w/o the undef). This is a construct that I like to use in my code and this comment bothers me. (Do others agree it's a little bit nutty btw?)

Thank you for any wisdom.

Comment on Q on HTML::Element recursive lambda comment Select or Download Code

Replies are listed 'Best First'.
Re: Q on HTML::Element recursive lambda comment by fizbin (Chaplain) on Mar 08, 2004 at 19:15 UTC
I agree that the explanatory text is a bit nutty, but there is a reason. It's not so that the code will work correctly, or better, or whatnot. It's so that you won't cause a memory leak if you reference the code again and again. As a demonstration, put the above code inside a `sub`, and then have something that calls that subroutine, say, 16_000_000 times. (Call it on the same very, very small tree - I recommend a tree that's no more than one element in size) Now try the same thing without the `undef`. The difference between the two sets of examples he gives is that the bad examples create a reference from `$give_id` to itself - that is, they create a recursive data structure. Perl's garbage collector doesn't handle recursive structures, so you shouldn't leave them around after a variable goes out of scope.	[reply] [d/l] [select]
Re: Re: Q on HTML::Element recursive lambda comment by lucylane (Acolyte) on Mar 08, 2004 at 20:08 UTC
Okay, duh, thanks. Another question though; here's why I often use the "anonymous lambda" style (dynamic sub): if I do something like this (embedded sub): `sub traverse { my $start_node = HTML::TreeBuilder->new; $start_node->parse_file(shift); { my $counter = 'x0000'; sub give_id { my $x = $_[0]; $x->attr('id', $counter++) unless defined $x->attr('id'); foreach my $c ($x->content_list) { give_id($c) if ref $c; # ignore text nodes } }; give_id($start_node); } }` [download] The var $counter is captured in the closure and does not get reset even though I'd like it to be reset. It's either this way and moving and resetting $counter by hand outside the block (which I won't do because it's logically inside the function and also easy to forget) or using the dynamic sub and risk a memory leak - how would I get the best of both worlds (uncaptured var + no risk of memory leak). Using a private embedded package or putting everything in it's own package seems wordy. Thanks for the enlightenment.	[reply] [d/l]
Re: Re: Re: Q on HTML::Element recursive lambda comment by fizbin (Chaplain) on Mar 08, 2004 at 20:24 UTC
Well, I'd do it by leaving in the `undef` statement. Either that, or replace it with something commented: `{ ... $give_id->($start_node); $give_id = 0; # Break circular structure and avoid memory leak }` [download] That is, just make sure to set `$give_id` to some value that doesn't depend on `$give_id` before `$give_id` goes out of scope.	[reply] [d/l] [select]
Re: Re: Q on HTML::Element recursive lambda comment by fizbin (Chaplain) on Mar 08, 2004 at 21:07 UTC
For what it's worth, I just tried abusing my laptop with this code: use strict; use HTML::Element; sub funny { my $counter = 'x0000'; my $give_id; $give_id = sub { my $x = $_[0]; $x->attr('id', $counter++) unless defined $x->attr('id'); foreach my $c ($x->content_list) { $give_id->($c) if ref $c; # ignore text nodes } }; $give_id->($_[0]); undef $give_id; ##### Remove for evil effects ##### } my $a = HTML::Element->new('a', href => 'http://www.perl.com/'); print "Start: ", scalar(localtime), "\n"; for my $i (0..1_000_000) { funny($a); } print "Finish: ", scalar(localtime), "\n"; [download] With the `undef` in place, the difference between the "Start" and "Finish" times was 15 seconds. With the undef commented out, the difference between "Start" and "Finish" was only a little over a minute (1:04); however, five minutes after printing the "Finish" line my laptop was still swapping like mad and the shell prompt hadn't yet come back - this indicates that perl was having a devil of a time cleaning things up on exit. (I just gave up and hit Ctrl-C at that point) Incidentally, while the version without an `undef` was running, it took a full 30 seconds for my laptop to display the Ctrl-Alt-Del screen after I gave it the three-finger salute. I'm not going to try the 16_000_000 times I suggested above.	[reply] [d/l] [select]
Re: Q on HTML::Element recursive lambda comment by kvale (Monsignor) on Mar 08, 2004 at 19:19 UTC
First of all, Sean Burke writes eclectic prose and doesn't write code so much as play with code. I love his style :) In this bit of the doc, he is rewriting the simple code traversal routine `{ my $counter = 'x0000'; sub give_id { my $x = $_[0]; $x->attr('id', $counter++) unless defined $x->attr('id'); foreach my $c ($x->content_list) { give_id($c) if ref $c; # ignore text nodes } }; give_id($start_node); }` [download] to be more lispish. In particular, he is assigning an anonymous sub to a lexical, rather than just creating a give_id sub. Doing the former promotes the sub to a first class object, i.e., it can be assigned to a variable and passed around the same as any other object. Now, you would expect, $giv_id to go out of scope at the end of the block, but it is used in the anonymous sub, so this 'closure' lives on unless you destroy it explicitly. This may or may not be what you want to do, depending on the rest of your code. I think the code is perfectly fine to use. It is not very nutty and should pose no problems. But the non-lispy code is a little simpler. -Mark	[reply] [d/l]
Re: Re: Q on HTML::Element recursive lambda comment by flyingmoose (Priest) on Mar 08, 2004 at 19:55 UTC
Note: I'm (as usual) off-topic... I really like it when languages allow me to be lispy -- even though I really dislike being lispy in lisp! Lots of insignificant parenthesis -- Go figure! I'd say this ability of Perl (to allow funky functional constructs without the painful restrictions of pure-functional languages) is 70% of my love of the language. (CPAN and the data structure support fill the other 30%). I don't know how many times I've wanted to do something with lamba functions and closures in C, and then I kick myself -- doh -- you can't do that! Java does a little better with the anonymous inner classes, but the syntax is horribly cludgy so I can't give it any browny points. I have this one build engine I'm rather proud of that heavily abuses map and anonymous subs. Why? Well, to keep people from messing with my build system -- err, no, because it's insanely powerful. It is, however, amazing how many self-proclaimed OO Gods can't grok functional code. I like it! Back on topic: "Left for an exercise for the reader" is bad form, IMHO, in official documentation or a reference manual. It's fine for a student textbook, but it annoys me to no end when a writer thinks they are clever and then won't explain why they think they are clever. Hey, I'm stupid...sometimes I can't figure out the darn exercises! Teach me, or better, give hints.	[reply]