I'd suggest something like:
    my @urls = ($url);
    while (@urls) {
        # Storage for links
        my @next_set;
        for my $url (@urls) {
            # Get the page & such
            $mechanize->get($url);
            my $page  = $mechanize->content;
            my $title = $mechanize->title;
            print "<b>$title</b><br />";
            # add the page's links to the next list
            push @next_set, $mechanize->links;
        }
        # OK, we've processed all in @urls, so load @urls
        # with all the links we've found since last time,
        # and start over again.
        @urls = @next_set;
    }
I use two arrays here because it's generally not a good idea to modify an array while you're iterating over it. So the second array acts as a bucket that holds all the links we find while processing the first array. Then, when we finish the first array, we reload it from the bucket and start again.
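If you want to watch the two-array pattern run without touching the network, here's a minimal sketch. The %links_on hash is a made-up link graph standing in for real pages, and crawl() is just a name I picked for illustration. The %seen hash is an extra that isn't in the code above: I'm assuming you'd want to skip pages you've already queued, since otherwise two pages that link to each other would keep the loop going forever.

```perl
use strict;
use warnings;

# Hypothetical link graph standing in for real pages:
# each key is a "page", each value lists the links found on it.
my %links_on = (
    'a' => [ 'b', 'c' ],
    'b' => [ 'a', 'd' ],
    'c' => [ 'd' ],
    'd' => [],
);

# Walk the graph one level at a time and return the pages in visit order.
sub crawl {
    my ($start) = @_;
    my @urls = ($start);
    my %seen = ( $start => 1 );    # assumption: skip pages already queued
    my @order;

    while (@urls) {
        my @next_set;              # bucket for links found during this pass
        for my $url (@urls) {
            push @order, $url;
            for my $link ( @{ $links_on{$url} || [] } ) {
                # Queue the link only the first time we see it
                push @next_set, $link unless $seen{$link}++;
            }
        }
        @urls = @next_set;         # reload only after the pass has finished
    }
    return @order;
}

print join( ' ', crawl('a') ), "\n";
```

Note that @urls is never touched inside the for loop. Everything new goes into @next_set, and the swap happens only once the pass is complete.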
...roboticus
When your only tool is a hammer, all problems look like your thumb.
In reply to Re: Building a Spidering Application by roboticus, in thread Building a Spidering Application by Anonymous Monk