Re: Logging URLs that don't return 1 with $mech->success

# loop through the current page's links and use ->follow_link( text_regex => q/$_/i to find and follow the current link

That would only follow the first link on that page matching some regex.

That may be what you want, but it reads as though you'd want to do something like:

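# Assumes $mech was created with autocheck => 0, so that a failed
# get() returns instead of dying and ->success can be tested.
for my $link ($mech->find_all_links) { # on this page
  $mech->get($link->url);
  unless ($mech->success) {
    # log the URL and HTTP status of every link that failed
    warn "can't get ".$link->url.", status: ".$mech->status;
  }
  $mech->back; # return to the page the links came from
}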

"What should it profit a man, if he should win a flame war, yet lose his cool?"

Re^2: Logging URLs that don't return 1 with $mech->success by stonecolddevin (Parson) on Sep 11, 2008 at 00:10 UTC
Here's what I've come up with. Looks even uglier I think, but it looks like it worked. It just needs to skip over "mailto:" links, which is easy. meh.
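A minimal sketch of the "skip mailto: links" step, using core Perl only. The helper name `is_fetchable` and the sample URLs are made up for illustration and are not from the posted code; the idea is to keep http/https and scheme-less (relative) links and drop everything else, which also covers things like `javascript:` links.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the URL looks like something to fetch over HTTP.
# Relative URLs (no scheme) are kept, since they resolve against the current page.
sub is_fetchable {
    my ($url) = @_;
    return 0 unless defined $url && length $url;

    # A scheme is a letter followed by letters, digits, "+", "." or "-", then ":".
    if ( $url =~ /^([A-Za-z][A-Za-z0-9+.\-]*):/ ) {
        my $scheme = lc $1;
        return ( $scheme eq 'http' || $scheme eq 'https' ) ? 1 : 0;
    }
    return 1;    # no scheme: treat as a relative link
}

# Made-up sample URLs standing in for $link->url values.
my @urls = (
    'http://example.com/',
    'mailto:someone@example.com',
    '/about',
    'javascript:void(0)',
);

my @fetchable = grep { is_fetchable($_) } @urls;
print "$_\n" for @fetchable;
```

In the link loop this would be a single line at the top of the body: `next unless is_fetchable($link->url);`.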
Re^3: Logging URLs that don't return 1 with $mech->success by ikegami (Patriarch) on Sep 11, 2008 at 01:14 UTC
The top-level URLs are processed differently than the links found at those URLs, so it makes no sense to use the same "checked" hash for both types of URLs. The following should be removed: `if ( $_ eq $checked_urls{$_} ) { print "Link checked, skipping\n"; next; } else`
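One way to read that suggestion, as a rough sketch: the top-level loop has no "already checked" test at all, and a separate hash tracks only the links found on the pages. The `%links_on` data and the name `%checked_links` are hypothetical stand-ins (the real script would get the links from `$mech->find_all_links`). The `$seen{$key}++` idiom replaces the `$_ eq $checked_urls{$_}` comparison.

```perl
use strict;
use warnings;

# Hypothetical data standing in for fetched pages:
# each top-level URL maps to the links found on that page.
my %links_on = (
    'http://example.com/a' => [ 'http://example.com/x', 'http://example.com/y' ],
    'http://example.com/b' => [ 'http://example.com/y', 'http://example.com/a' ],
);

my %checked_links;    # tracks only links found on pages, never top-level URLs
my @to_check;

for my $top ( sort keys %links_on ) {
    # Top-level URLs are always processed; no "already checked" test here.
    for my $link ( @{ $links_on{$top} } ) {
        # Post-increment returns the old count, so this is false only the first time.
        next if $checked_links{$link}++;
        push @to_check, $link;
    }
}

print "$_\n" for @to_check;
```

Here the link to `/y` on the second page is skipped because it was already seen on the first, while `/a` is still checked as a link even though it is also a top-level URL.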
Re^4: Logging URLs that don't return 1 with $mech->success by stonecolddevin (Parson) on Sep 11, 2008 at 01:19 UTC
Thanks, ikegami, I had been looking at that and scratching my head over it. Two of the same `if` statements in two different places didn't really look right to me. meh.