in reply to Win32::IE::Mechanize not getting correct content

I don't know what you're trying to do with this task, and I know it doesn't help you solve the problem you're having ...

... but are you aware that there's a robots.txt on the site, specifically requesting that you not automate access to its contents?

(Yes, I know there are many differing interpretations of which types of programs should honor robots.txt. For instance, a process you start manually that just grabs the pages to present them to you as one page is likely different from retrieving the pages for pre-caching or for a search engine spider.)

Update: ikegami is right -- my browser put its insertion bar at the end of the line, which I mistook for a /. (Bah ... time to get my eyes checked again.) They specifically allow robot access. Feel free to downvote my oversight.
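
For anyone else who trips over the same thing, here is a minimal sketch of the difference between an empty Disallow: line and Disallow: /. It is in Python rather than Perl, purely because the standard-library parser makes it short. The agent name, the URL, and the two rule sets are made up for illustration; they are not the actual contents of the site's robots.txt.

```python
from urllib.robotparser import RobotFileParser


def allows(robots_lines, url, agent="ExampleBot"):
    """Return True if the given robots.txt lines permit `agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse from memory, no network access
    return parser.can_fetch(agent, url)


# Hypothetical URL, used only for illustration.
URL = "http://example.com/some/page.html"

# An empty Disallow value disallows nothing, so everything is permitted.
allow_all = ["User-agent: *", "Disallow:"]

# A lone slash matches every path, so nothing is permitted.
block_all = ["User-agent: *", "Disallow: /"]

print(allows(allow_all, URL))
print(allows(block_all, URL))
```

The first call should report that the fetch is permitted and the second that it is not, which is exactly the one-character difference I misread.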

Re^2: Win32::IE::Mechanize not getting correct content
by ikegami (Patriarch) on Mar 15, 2007 at 19:19 UTC
    Did the robots.txt change? Because it currently doesn't disallow anything. (Disallow: / would.)
Re^2: Win32::IE::Mechanize not getting correct content
by cormanaz (Deacon) on Mar 15, 2007 at 19:01 UTC
    Yes, I'm aware. The robots.txt file appears to be targeted at feedburner.com. And you're right, this doesn't help me solve my problem.