PerlMonks
Re: problem parsing html by spx2 (Deacon)
on Jul 15, 2009 at 15:12 UTC ( [id://780354]=note )
My experience with HTML::TreeBuilder taught me to use HTML::TreeBuilder::XPath instead, and I would recommend that you use it too. (HTML::TreeBuilder is nice, but HTML::TreeBuilder::XPath is abstract enough to be elegant.) Let's see how your code would look with that module:
EDIT: small adjustments

OUTPUT:
First of all, we have simplified the code from 33 lines to 15 lines.

Second, we kept the meaning of the code, which was: "Give me the a tags that sit inside a td tag and whose href attribute matches the regex &chain=(\w). Now take those a tags, apply a regex to their href attribute, and take the word after the chain= substring." That is essentially what this XPath query says => //td//a[contains(\@href,'chain=')] (the backslash before @ is only needed because the query sits inside a double-quoted Perl string).

I have also used this Firefox addon to check that my XPath was right. If you're interested in reading more about XPath, read here and here.

You also have a mistake in your code: you delete the HTML::TreeBuilder object after the first iteration of the for loop.
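For readers who want to try the idea, here is a minimal sketch of the approach described above. It is not the code from the original post: the sample HTML, the variable names, and the regex chain=(\w+) are assumptions made for illustration. It runs the XPath query against a small inline document, pulls the word after chain= out of each href, and deletes the tree only once, after the loop has finished.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Hypothetical sample input; the real page from the thread is not shown here.
my $html = <<'HTML';
<table><tr>
<td><a href="view.cgi?id=1abc&amp;chain=A">chain A</a></td>
<td><a href="view.cgi?id=1abc&amp;chain=B">chain B</a></td>
<td><a href="help.html">help</a></td>
</tr></table>
<a href="view.cgi?chain=Z">outside a td</a>
HTML

my $tree = HTML::TreeBuilder::XPath->new;
$tree->parse_content($html);

my @chains;
# q{} is a single-quoted string, so the @ needs no backslash here.
for my $link ( $tree->findnodes(q{//td//a[contains(@href,'chain=')]}) ) {
    my $href = $link->attr('href');
    # Assumed pattern: capture the whole word after "chain=".
    push @chains, $1 if $href =~ /chain=(\w+)/;
}

# Free the tree only after every node has been processed,
# never inside the loop.
$tree->delete;

print "$_\n" for @chains;
```

Note that //td//a selects a tags anywhere below a td, not only direct children; use //td/a if you want the td to be the immediate parent.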