in reply to Re: Logging URLs that don't return 1 with $mech->success
in thread Logging URLs that don't return 1 with $mech->success

Limbic~Region,

Thanks very much!

My next step was to add "throttling" or what have you so that I'm not querying a given site inconsiderately. I didn't even really think of the DFS, that's a pretty neat idea! I'll play with this, and propose the idea to my employer.
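
A rough sketch of the kind of throttling meant here, assuming a fixed minimum gap between requests to the same host. The names `throttle`, `%last_hit` and `$MIN_DELAY` are made up for illustration and are not part of WWW::Mechanize:

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);    # core module; sub-second time and sleep

# Assumed politeness interval, in seconds, between hits on the same host.
my $MIN_DELAY = 2;

# Timestamp of the most recent request, keyed by host name.
my %last_hit;

# Block until at least $MIN_DELAY seconds have passed since the previous
# request to $host, then record the current time for that host.
sub throttle {
    my ($host) = @_;
    my $wait = ( $last_hit{$host} // 0 ) + $MIN_DELAY - time;
    sleep $wait if $wait > 0;
    $last_hit{$host} = time;
    return;
}
```

The idea would be to call `throttle($host)` right before each `$mech->get(...)`, so different hosts don't hold each other up but no single host gets hammered.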

I think I could even extend this into the backend admin panel I'm surely going to be writing (for EVERYONE'S sanity): it could store an internal/external boolean for each link, which would potentially make this more robust and, with any luck, fast(er).

Thanks again, that's a neat idea! :-)

meh.

Re^3: Logging URLs that don't return 1 with $mech->success
by Limbic~Region (Chancellor) on Sep 12, 2008 at 13:13 UTC
    dhoss,
    Actually, I just realized you could have a monster on your hands without one more sanity check:
    # push @work, get_links($link);
    push @work, get_links($link) if ! off_site($link);
    I am sure somewhere on the university website there is a link off-site and you don't want to end up crawling the entire internet - it could take a while (and get you fired).
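
One possible shape for `off_site`, as a minimal sketch only. It assumes the URI module (which WWW::Mechanize already depends on) and a hypothetical `$BASE` holding the site you started from; `http://www.example.edu/` is a placeholder, not the real site:

```perl
use strict;
use warnings;
use URI;

# Hypothetical starting point for the crawl; substitute the real site root.
my $BASE = URI->new('http://www.example.edu/');

# Returns true when $link points at a host other than the one in $BASE.
sub off_site {
    my ($link) = @_;

    # Resolve relative links against the base before comparing hosts.
    my $uri = URI->new_abs( $link, $BASE );

    # mailto:, javascript: and the like have no host; treat them as
    # off-site so the crawler never tries to follow them.
    return 1 unless $uri->can('host') && defined $uri->host;

    return lc( $uri->host ) ne lc( $BASE->host );
}
```

Note that this compares exact host names, so a sibling subdomain of the same institution counts as off-site. Whether that is what you want depends on how much of the site you mean to cover.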

    Cheers - L~R