PerlMonks  

Re: Scraping with Treebuilder

by HuckinFappy (Pilgrim)
on Jul 17, 2006 at 03:50 UTC ( #561651=note )


in reply to Scraping with Treebuilder

The first error liverpole identified is easy to find if you turn on warnings (which you are not using in your code):
    [10] perl -Mwarnings /tmp/testit.pl
    "my" variable @perlbooks masks earlier declaration in same scope at /tmp/testit.pl line 50.
    Bareword "parent" not allowed while "strict subs" in use at /tmp/testit.pl line 15.
    syntax error at /tmp/testit.pl line 34, near ") return"
    Global symbol "@hrefs" requires explicit package name at /tmp/testit.pl line 34.
    Execution of /tmp/testit.pl aborted due to compilation errors.
The error that stops your code from even compiling, though, is the missing semicolon. After 13 years of writing Perl, I still find that missing semicolons, parens and braces are sometimes the hardest errors to find. I use perltidy now to help. For example, running your code through perltidy, I end up with:
    sub get_url {
        my $node = shift;
        my @hrefs = $node->look_down(_tag => 'a') return unless @hrefs;
        my $url = $hrefs[0]->attr('href');
        $url =~ s/\s+$//;
        return $url;
    }
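Well, that long line jumped right out at me as being seriously wrong: the look_down() assignment runs straight into the return, so the semicolon belongs right after the closing paren. It was easy to figure out the fix then.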
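For what it's worth, here is a rough sketch of how the sub might look with the semicolon put back and strict and warnings switched on. It assumes HTML::TreeBuilder is installed from CPAN. The sample markup and the extra check for a missing href are my own additions for illustration, not something from your script.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Return the href of the first <a> element found under $node,
# or nothing at all if there is no usable link.
sub get_url {
    my $node  = shift;
    my @hrefs = $node->look_down( _tag => 'a' );    # semicolon restored here
    return unless @hrefs;
    my $url = $hrefs[0]->attr('href');
    return unless defined $url;    # extra guard (an assumption, not in the original)
    $url =~ s/\s+$//;              # strip trailing whitespace
    return $url;
}

# Hypothetical sample markup, only here to exercise the sub.
my $tree = HTML::TreeBuilder->new_from_content(
    '<ul><li><a href="http://example.com/book ">A book</a></li></ul>'
);
my $url = get_url($tree);
print defined $url ? "$url\n" : "no link found\n";
$tree->delete;    # free the tree when done
```

With the statements split properly, perltidy puts each one on its own line, and nothing runs together the way it did before.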

HTH,
~Jeff
