in reply to Scraping a website - various problems
I asked a very similar question yesterday. My trouble was that the web page is written so that it only makes sense after a JavaScript engine has processed it. I was told that LWP was giving me the same result as File->SaveAs in the browser (not true). After that, I started looking into the JavaScript.pm module, thinking I could perhaps run the page source through a JavaScript engine myself. I stopped there; it looked very daunting. Plus, in the end, I found a different way of solving the problem...
Where cleverness fails, use brute force.
However, as for the page sometimes failing for no apparent reason...
I have had that happen. I have not been able to debug it, but it appears that when I hit a URL thousands of times, the request will occasionally fail. It doesn't seem to die permanently, though. I put in a "fail this many times" clause, so a request is only given up on after that many failed attempts, and that fixed my problem... (brute force over cleverness)
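For what it's worth, here is a rough sketch of what such a "fail this many times" clause might look like. This is not my exact code; the names retry and fetch_url, the 30-second timeout, and the try count and delay in the usage line are all made up for illustration. The only real library calls are LWP::UserAgent's new, get, is_success and decoded_content.

```perl
use strict;
use warnings;

# Call $attempt->() up to $max_tries times and return the first defined
# result. An attempt counts as failed if it returns undef or dies.
# Returns undef if every attempt fails.
sub retry {
    my ($attempt, $max_tries, $delay) = @_;
    for my $try (1 .. $max_tries) {
        my $result = eval { $attempt->() };
        return $result if defined $result;
        # Pause between attempts, but not after the last one.
        sleep $delay if $delay && $try < $max_tries;
    }
    return;
}

# Hypothetical fetcher built on LWP::UserAgent (loaded only when called).
# Returns the page content on success, undef on any failed response.
sub fetch_url {
    my ($url) = @_;
    require LWP::UserAgent;
    my $ua       = LWP::UserAgent->new(timeout => 30);
    my $response = $ua->get($url);
    return $response->is_success ? $response->decoded_content : undef;
}

# Usage (assumed URL and limits):
# my $html = retry(sub { fetch_url('http://example.com/') }, 5, 2);
```

Keeping the retry loop separate from the fetch means the same loop can wrap any flaky step, and the limit keeps a permanently dead URL from looping forever.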
Stony