in reply to Scrape "generated" content from secure site?

Some sites use dirty tricks, such as setting cookies when you GET an image (or a CSS file). So you should inspect the server responses, for example with LiveHTTPHeaders.
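
If that turns out to be the case, one way to imitate the browser is to have Mech request the page's images and stylesheets too. This is only a rough sketch: `get_with_assets` is a name made up for illustration, the URL in the comment is a placeholder, and it relies on `clone` sharing the cookie jar with the original object, as the WWW::Mechanize docs describe.

```perl
use strict;
use warnings;
use WWW::Mechanize;

# Fetch $url, then request the images and <link> targets it references,
# so cookies set on those responses end up in the same cookie jar.
# Returns the number of sub-resources that were fetched successfully.
sub get_with_assets {
    my ($mech, $url) = @_;
    $mech->get($url);

    # Absolute URLs of <img src> and <link href> (stylesheets, icons, ...).
    my %seen;
    my @assets = grep { !$seen{$_}++ }
                 map  { $_->url_abs->as_string }
                 grep { defined $_->url }
                 ($mech->images, $mech->find_all_links(tag => 'link'));

    # A clone shares the original's cookie jar but keeps its own page state,
    # so $mech still points at $url afterwards.
    my $fetcher = $mech->clone;
    my $fetched = 0;
    for my $asset (@assets) {
        # eval: with autocheck on, a missing image would otherwise die.
        $fetched++ if eval { $fetcher->get($asset)->is_success };
    }
    return $fetched;
}

# Hypothetical usage:
#   my $mech = WWW::Mechanize->new;
#   get_with_assets($mech, 'https://example.com/login');
#   print $mech->cookie_jar->as_string;
```

Comparing the printed jar with what LiveHTTPHeaders shows should tell you whether one of those extra requests is where the cookie comes from.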

Also, some sites expect x and y coordinates to be submitted when you click a button...
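
For an image button, the coordinates travel as two extra form fields, `name.x` and `name.y`. The sketch below builds that request from an HTML::Form object so you can look at it before sending. The helper name `image_click_request`, the button name `go`, and the coordinates are invented for the example.

```perl
use strict;
use warnings;
use HTML::Form;

# Build the request a browser would send when an image button is clicked
# at ($x, $y). $form is an HTML::Form, e.g. $mech->current_form.
sub image_click_request {
    my ($form, $button, $x, $y) = @_;
    return $form->click($button, $x, $y);   # returns an HTTP::Request
}

# Hypothetical usage with WWW::Mechanize:
#   my $req = image_click_request($mech->current_form, 'go', 10, 20);
#   print $req->as_string;     # inspect what will be sent
#   $mech->request($req);      # then actually submit it
#
# Or let Mech do both steps in one call:
#   $mech->click('go', 10, 20);
```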

Re^2: Scrape "generated" content from secure site?
by hackerkatt (Initiate) on Apr 27, 2009 at 12:16 UTC
    @Gangbass, how can I be sure that the cookie I'm getting/saving is being used by Mech? In Firebug I see this cookie being stored:
    151318053.1240690172.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)
    And this script snippet, from the page with the link to my final destination, looks like it could have some bearing. I'm trying to save the minified JS for inspection, but I'm having trouble getting all of it. Any thoughts on what you see here? Perhaps it is some commonly used code?
    <script type="text/javascript"> var pageTracker = _gat._getTracker("UA-3875979-1"); pageTracker._initData(); pageTracker._trackPageview(); </script>

      This is the Google Analytics tracking code, so I don't think it is what you are looking for...

      Earlier you said that this site works without JavaScript... Did you clear your cookies before testing that?

      If it is JavaScript, then you just need to check where it sets cookies and implement that logic in your WWW::Mechanize script.
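
A rough sketch of what that could look like, assuming Mech is using its default HTTP::Cookies jar. The helper names, the cookie name and value, and the one-hour lifetime are all placeholders. The second helper prints the Cookie header of each outgoing request, which is one way to confirm that Mech really sends what is in the jar.

```perl
use strict;
use warnings;
use WWW::Mechanize;

# Put a cookie into Mech's jar by hand, the way page JavaScript would via
# document.cookie. $domain must be the exact host name or a dot-prefixed
# parent domain such as ".example.com". Returns the jar as a string.
sub add_js_cookie {
    my ($mech, $name, $value, $domain) = @_;
    $mech->cookie_jar->set_cookie(
        0,          # version (0 = Netscape-style cookie)
        $name, $value,
        '/',        # path
        $domain,
        undef,      # port (any)
        1,          # path_spec: the path was given explicitly
        0,          # secure
        3600,       # maxage in seconds (assumed value)
        0,          # discard
        {},         # no extra attributes
    );
    return $mech->cookie_jar->as_string;
}

# Print the Cookie header of every request just before it is sent.
sub log_sent_cookies {
    my ($mech) = @_;
    $mech->add_handler(request_send => sub {
        my ($request) = @_;
        print STDERR $request->uri, ' Cookie: ',
                     ($request->header('Cookie') // '(none)'), "\n";
        return;     # returning nothing lets the request go out as usual
    });
    return;
}

# Hypothetical usage:
#   my $mech = WWW::Mechanize->new;
#   log_sent_cookies($mech);
#   add_js_cookie($mech, 'session_hint', 'abc123', 'www.example.com');
#   $mech->get('http://www.example.com/');
```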

        Actually, I didn't clear cookies before I tested. I'll try again.