in reply to Logging URLs that don't return 1 with $mech->success

# loop through the current page's links and use ->follow_link( text_regex => qr/$_/i ) to find and follow the current link
That would only follow the first link on that page matching some regex.

That may be what you want, but it reads as though you'd want to do something like:

for my $link ($mech->find_all_links) { # on this page
    $mech->get($link->url);
    unless ($mech->success) {
        warn "can't get ".$link->url.", status: ".$mech->status;
    }
    $mech->back;
}
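
A slightly fuller sketch of that loop, under a few assumptions that are not in the post above: the $mech object was created with autocheck => 0 (so a failed get() is reported through ->success instead of dying), only http and https links are worth fetching, and the helper names is_http_url and check_links are made up for illustration.

```perl
use strict;
use warnings;

# True for links worth fetching; false for mailto:, javascript:, etc.
sub is_http_url {
    my ($url) = @_;
    return "$url" =~ m{^https?://}i ? 1 : 0;
}

# Fetch every http(s) link on the page $mech is currently on and
# return a list of [url, status] pairs for the ones that failed.
# Assumes $mech was created with autocheck => 0, so that a failed
# get() shows up in ->success rather than as an exception.
sub check_links {
    my ($mech) = @_;
    my @failed;

    # The link list is built once, before any get() changes the page.
    for my $link ( $mech->find_all_links ) {
        my $url = $link->url_abs;    # absolute form of the link
        next unless is_http_url($url);

        $mech->get($url);
        push @failed, [ "$url", $mech->status ] unless $mech->success;
        $mech->back;
    }
    return @failed;
}

# Usage (not run here; needs WWW::Mechanize and network access):
#   use WWW::Mechanize;
#   my $mech = WWW::Mechanize->new( autocheck => 0 );
#   $mech->get('http://example.com/');    # placeholder URL
#   warn "can't get $_->[0], status: $_->[1]\n" for check_links($mech);
```

Collecting the failures in a list rather than warning inside the loop is just one option; it makes it easy to write them to a log file afterwards.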

Re^2: Logging URLs that don't return 1 with $mech->success
by stonecolddevin (Parson) on Sep 11, 2008 at 00:10 UTC

    Here's what I've come up with. Looks even uglier I think, but it looks like it worked. It just needs to skip over "mailto:" links, which is easy.

    meh.

      The top-level urls are processed differently than the links found at the urls, so it makes no sense to use the same "checked" hash for both types of urls.

      The following should be removed:

      if ( $_ eq $checked_urls{$_} ) {
          print "Link checked, skipping\n";
          next;
      }
      else
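
      If duplicate checking is still wanted at both levels, one way to keep the two kinds of URL apart is a separate hash for each. This is only a sketch: the original script is not shown in this thread, so %links_on, %seen_page and %seen_link are hypothetical names, and the hard-coded data stands in for what the real crawl would find.

```perl
use strict;
use warnings;

# Hypothetical data standing in for the real crawl: each top-level URL
# maps to the links found on that page.
my %links_on = (
    'http://example.com/a' => [ 'http://example.com/x', 'http://example.com/y' ],
    'http://example.com/b' => [ 'http://example.com/x', 'http://example.com/z' ],
);

my %seen_page;    # top-level URLs already processed
my %seen_link;    # links already checked, tracked separately
my @checked;

for my $page ( sort keys %links_on ) {
    next if $seen_page{$page}++;

    for my $link ( @{ $links_on{$page} } ) {
        # Post-increment is false on the first visit, true afterwards.
        next if $seen_link{$link}++;
        push @checked, $link;
    }
}

print "$_\n" for @checked;
```

      The link shared by both pages is checked once, and a top-level URL never blocks a link from being checked (or the other way round).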

        Thanks, ikegami, I had been looking at that and scratching my head over it. Two of the same if statements in two different places didn't really look right to me.

        meh.