PS: I find $name =~ s#^http://##i; a bit "hard on the eyes". I sometimes use the | character, s|^http://||i;, but some folks object to that as the | normally means "or" in a regex. I think s[^http://][]i; will work also.

  $mech->get($_);
  if (!$mech->success())
  {   # the get failed somehow... do something
      print "get of $_ failed!\n";
      next;   # skip this URL and go on to the next one
  }
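The delimiter variants mentioned in the PS can be checked quickly - all three substitutions below strip a leading http:// (case-insensitively) and behave identically; the URLs are made-up examples:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Three spellings of the same substitution, differing only in delimiter.
for my $u ('HTTP://example.com', 'http://example.com/page') {
    (my $a = $u) =~ s#^http://##i;    # '#' delimiters
    (my $b = $u) =~ s|^http://||i;    # '|' delimiters (some find this confusing)
    (my $c = $u) =~ s[^http://][]i;   # bracket pairs: pattern in one, replacement in the other
    print "$a | $b | $c\n";
}
```

With paired delimiters like [ ], the pattern and the replacement each get their own bracket pair, which some find easier to read than three repeated characters.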
I don't know if these "not found" errors are transient or not. You can make use of the redo function to restart the current iteration of the while() loop without re-evaluating the loop conditional (i.e., without getting the next url) - it is like "next;" except that the while conditional is not re-evaluated. Of course you will need to pair the redo; with some appropriate counter for max_retries so that you don't wind up in an infinite loop. But the first step would be to see if just skipping that URL, like above, will allow the code to complete. Then we can talk about "how to give it another chance".
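A minimal sketch of the redo-with-a-counter idea. Here fake_get() is a stand-in for $mech->get($_) (it fails twice, then succeeds) so the retry logic can be seen on its own; $max_retries and the URL are made-up names:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-in for $mech->get(): fails on the first two attempts, then succeeds.
my $attempts = 0;
sub fake_get { $attempts++; return $attempts >= 3 ? 1 : 0 }

my @results;
my $max_retries = 5;
my $tries       = 0;
my @urls        = ('http://example.com/a');

while (my $url = shift @urls) {
    $tries++;
    if (!fake_get($url)) {
        if ($tries < $max_retries) {
            print "retry $tries for $url\n";
            redo;    # re-run this loop body; the while() conditional is
                     # NOT re-evaluated, so $url keeps its current value
        }
        print "giving up on $url\n";
        next;        # too many failures: move on to the next URL
    }
    push @results, $url;
    $tries = 0;      # reset the counter before the next URL
}
print "fetched: @results\n";
```

The key point is that redo re-enters the block with $url unchanged, while next would shift the next URL off the list; the $tries counter is what keeps a permanently dead URL from looping forever.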
BTW: It's been some months since we talked about this project. What led you to go down the road of using Mechanize::Firefox? This adds an additional layer of complication to the whole thing - I, for example, am having a version issue between Firefox and MozRepl - so there are some "landmines" along this path.
Update:
If you add $|=1; at the top of the code, this will un-buffer writes to STDOUT and make it easier to follow what the code is doing while it executes. Without it, there is a long lag between the program printing and that output appearing on the screen: the typical buffer is ~4KB, so many lines are "printed" by the program before they are "flushed" to the output. Flushing every print has a performance cost, but in this case it will make no measurable difference.
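For illustration, the one-liner and its more readable IO::Handle equivalent (both set autoflush on STDOUT):

```perl
#!/usr/bin/perl
use strict;
use warnings;

$| = 1;    # terse form: un-buffer the currently selected handle (STDOUT)

# Equivalent, more self-documenting form:
use IO::Handle;
STDOUT->autoflush(1);

print "progress is visible immediately\n";   # flushed as soon as it is printed
```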
In reply to Re: WWW::Mechanize::Firefox runs well: some attempts to make the script a bit more robust
by Marshall
in thread WWW::Mechanize::Firefox runs well: some attempts to make the script a bit more robust
by Perlbeginner1