in reply to Re: Scraping an ASP form I don't have any control over
in thread Scraping an ASP form I don't have any control over

Thanks, I'll take a close look at what you've provided. My current solution works for just the one piece of data I'm pulling, but I can do so much more with the data I can retrieve.

Replies are listed 'Best First'.
Re^3: Scraping an ASP form I don't have any control over
by huck (Prior) on Apr 16, 2017 at 01:31 UTC

    If what you have via WWW::Mechanize works for you, that's just fine. I just thought you might find it useful to see a more nuts-and-bolts way of doing it too, one closer to "core".

    One thing I do like about my method is the dumper file, which holds the whole response, not just the content. It helps in figuring out what is going on. I don't "mech", but it would be worth finding a way of changing $mech->save_content("formfile.txt"); to something like $mech->save_response("formfile.txt");, since the content is inside the response anyway. Related to that is the ability to rerun from the saved files, so you can fix an intermediate step locally with temporary prints without having to go to the website again.
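A rough sketch of that save-and-rerun idea, for what it's worth. The names save_response and load_response are made up here (as far as I know WWW::Mechanize has no save_response method of its own); the only library calls relied on are HTTP::Response's as_string and parse, plus $mech->response to get at the response object. The file name is just the one from the post.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical helper: write the whole response (status line, headers
# and body) to a file, not just the body.
sub save_response {
    my ($res, $file) = @_;
    open my $fh, '>:raw', $file or die "Cannot write $file: $!";
    print {$fh} $res->as_string("\n");
    close $fh or die "Cannot close $file: $!";
    return;
}

# Hypothetical helper: rebuild an HTTP::Response object from a file
# written by save_response(), so later steps can run offline.
sub load_response {
    my ($file) = @_;
    open my $fh, '<:raw', $file or die "Cannot read $file: $!";
    my $raw = do { local $/; <$fh> };
    close $fh or die "Cannot close $file: $!";
    return HTTP::Response->parse($raw);
}

# With WWW::Mechanize the live call would look something like:
#   save_response($mech->response, "formfile.txt");
# and on a later run, without touching the site:
#   my $res = load_response("formfile.txt");
#   print $res->status_line, "\n", $res->decoded_content;
```

The saved file is plain text, so you can also open it in an editor and read the headers and cookies directly while working out what the form expects.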

Re^3: Scraping an ASP form I don't have any control over
by stevieb (Canon) on Apr 16, 2017 at 00:02 UTC

    If it's not in a proper/standard format, and the return isn't documented, be prepared to modify your code every single time the remote site decides to change their minds...

      That is true even with an API, though. I'm on my third rewrite because of API changes at YouTube, but then I think there were at least half a dozen rewrites back when I was just scraping the channel video indexes for my stats, before there even was an API.

        Agreed. That said, someone producing a real API will typically attach a version number to it (v1, v2, etc.). If you're not getting JSON, XML or some other form of structured data, or you're having to change code due to undocumented or unannounced changes made by the site in question, they suck. Period.

        This problem isn't even specific to Perl; it's across the board. If there's an API, keep it consistent, and don't make major changes without a bump in revision. If you're attempting to offer up data, document it. Do not randomly change stuff. If you're a user of a site that does not have a documented data format or a documented API, bitch at them, or go with a different site.