mascip has asked for the wisdom of the Perl Monks concerning the following question:
I can't manage to comment on "Improve your extracted transversal" on chromatic's blog, so I'll put my questions and remarks here, hoping to learn through discussion. You won't understand my questions unless you have read the original posts.
First, what would happen if we split the process into two subroutines? For example (untested code):
    use List::Util qw(reduce);

    ## Concatenates the texts retrieved in a node's (possibly nested) descendants
    # Note: preserve recursion with process_text_in($node)
    sub get_all_text_in {
        my $node = shift;

        # Concatenate the texts extracted from each child node
        # (the '' seed makes sure the first child gets processed too)
        return reduce { $a . process_text_in($b) } '', $node->content_list;
    }

    ## Get the text in a node, processing it if needed
    # Note: preserve recursion with get_all_text_in($node)
    sub process_text_in {
        my $node = shift;

        # Just text => get it
        return $node unless ref $node;

        # Not a special tag => get its children texts
        my $tag = $node->tag;
        return get_all_text_in($node) unless $action{$tag};

        # Special tag => process it accordingly
        return $action{$tag}->($node);
    }

I feel like there would be both benefits and drawbacks:
Having it all in one place feels nice and safe, but so does keeping the logic very simple. Maybe, for a recursion, keeping it all in one place matters more.
My question is: in terms of reference cycles, what would the consequence be? Would my solution remove the need for the "undef $traverse"?
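To make the cycle concrete, here is a minimal sketch (not chromatic's code). The `Sentinel` class and `count_with_closure` are hypothetical names made up for the illustration: the sentinel is captured by the closure only so that we can observe whether the closure is ever freed.

```perl
use strict;
use warnings;

{
    # Hypothetical helper class: counts how many instances were destroyed.
    package Sentinel;
    our $destroyed = 0;
    sub new     { return bless {}, shift }
    sub DESTROY { $destroyed++ }
}

# A recursive anonymous sub: the sub captures $traverse, and $traverse
# holds the sub, so each keeps the other alive (a reference cycle).
sub count_with_closure {
    my ($n, $break_cycle) = @_;
    my $sentinel = Sentinel->new;
    my $traverse;
    $traverse = sub {
        my ($i) = @_;
        my $class = ref $sentinel;    # mention $sentinel so the closure captures it
        return $i <= 0 ? 0 : 1 + $traverse->($i - 1);
    };
    my $result = $traverse->($n);
    undef $traverse if $break_cycle;  # drops the sub, which breaks the cycle
    return $result;
}

count_with_closure(3, 0);
print "without undef: destroyed = $Sentinel::destroyed\n";
count_with_closure(3, 1);
print "with undef:    destroyed = $Sentinel::destroyed\n";
```

If I understand reference counting correctly, the first call should leave the counter at 0 (the closure and everything it captured stay allocated) and the second should raise it to 1. Two named subs that call each other go through the symbol table instead of a captured lexical, so as far as I can tell there is no per-call closure to leak in that version.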
Next, a related question: because both entries (a, p) in the %action hash start with $traverse->($node), it would be possible to move that call out of the hash, which I think would solve one of the weak reference problems. (Would it?) A good reason not to do that would be if you later added entries to the hash that didn't start with $traverse->($node). I am not familiar enough with HTML parsing to know whether this is likely to happen. Is it?
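A rough sketch of what I mean by taking the call out of the hash. Everything here is an assumption made for the example: plain hashes stand in for HTML::Element nodes, and `%format`, `text_in` and the output formats are invented names, not the ones from the blog post.

```perl
use strict;
use warnings;

# Hypothetical formatters: each one receives the node and the text already
# collected from its children, so none of them has to recurse.
my %format = (
    a => sub { my ($node, $text) = @_; return $text . ' (' . $node->{href} . ')' },
    p => sub { my ($node, $text) = @_; return $text . "\n\n" },
);

sub text_in {
    my ($node) = @_;

    # Plain text leaf => return it unchanged
    return $node unless ref $node;

    # The recursion happens here, once, for every kind of tag
    my $text = join '', map { text_in($_) } @{ $node->{content} };

    # Special tag => format the collected text; otherwise pass it through
    my $formatter = $format{ $node->{tag} };
    return $formatter ? $formatter->($node, $text) : $text;
}

my $tree = {
    tag     => 'p',
    content => [
        'See ',
        { tag => 'a', href => 'http://example.com', content => ['this page'] },
        '.',
    ],
};
print text_in($tree);
```

The formatters no longer refer to the traversal at all, so they cannot take part in a cycle with it. The cost is the one mentioned above: a tag that must not descend into its children would need a different mechanism.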
And finally, if the answer to the last question is yes, are there other ways than using a hash? What if the hash contained references to named subroutines, which would call get_all_text_in($node)? Would we still have a leak? Sorry, I'm not clear enough on how leaks work. What should I read?
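For comparison, here is how I understand the weak-reference approach, as a minimal sketch using `weaken` from Scalar::Util. `Sentinel`, `count_with_weak_ref` and `$weak_self` are hypothetical names for the illustration; the sentinel exists only to show when the closure is freed.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

{
    # Hypothetical helper class: counts how many instances were destroyed.
    package Sentinel;
    our $destroyed = 0;
    sub new     { return bless {}, shift }
    sub DESTROY { $destroyed++ }
}

sub count_with_weak_ref {
    my ($n) = @_;
    my $sentinel = Sentinel->new;
    my $weak_self;
    my $traverse = sub {
        my ($i) = @_;
        my $class = ref $sentinel;    # mention $sentinel so the closure captures it
        return $i <= 0 ? 0 : 1 + $weak_self->($i - 1);
    };

    # The closure only sees a weak copy of the reference to itself, so the
    # strong reference in $traverse is the only thing keeping it alive.
    $weak_self = $traverse;
    weaken($weak_self);

    return $traverse->($n);
}

count_with_weak_ref(3);
print "destroyed = $Sentinel::destroyed\n";
```

Here no explicit undef is needed: when $traverse goes out of scope the sub's reference count should drop to zero, and the counter should read 1 after the call.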
Thank you for any insight or discussion.
Re: Questions about Recursion and "Extract your transversal", by chromatic
by Anonymous Monk on Jun 07, 2013 at 00:46 UTC
by mascip (Pilgrim) on Jun 07, 2013 at 07:56 UTC
by mascip (Pilgrim) on Jun 10, 2013 at 17:13 UTC