How to scrape data from ajax calls

shakuni has asked for the wisdom of the Perl Monks concerning the following question:

Re: How to scrape data from ajax calls by Corion (Patriarch) on Jan 08, 2009 at 14:05 UTC
"Ajax" transfers its data over HTTP, just like regular web pages, so you can use whatever you already use to scrape regular web pages. You will need to treat the results a bit differently: if JavaScript is returned, your script will have to interpret it. Alternatively, you can automate the website from the outside, for example with Win32::IE::Mechanize, and capture the traffic with Sniffer::HTTP. Where exactly do you have problems?
Re: How to scrape data from ajax calls by marto (Cardinal) on Jan 08, 2009 at 14:07 UTC
By 'google tasks', do you mean tasks within Google Calendar? If not, I have no idea what you mean (nor does Google). However, Using WWW::Selenium To Test Or Automate An Ajax Website, from the Tutorials section of this site, may be of interest to you. Update: Fixed formatting typo. Martin
Re^2: How to scrape data from ajax calls by shakuni (Initiate) on Jan 08, 2009 at 14:35 UTC
By google tasks, I mean this. I'm trying to log in to Gmail automatically and then scrape all the tasks from there. No success yet :(
Re^3: How to scrape data from ajax calls by Limbic~Region (Chancellor) on Jan 08, 2009 at 15:42 UTC
shakuni, OK, you have been given a hint (Using WWW::Selenium To Test Or Automate An Ajax Website). Why don't you start by saying what you have and have not accomplished:
- Successfully logging in
- Clicking the "tasks" link
- Finding the tasks pop-up window
- Fetching the HTML of the tasks pop-up window
- Parsing the HTML to the desired end
In other words, show some effort and give us something more to go on than "it doesn't work". Cheers - L~R
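The steps listed above could be sketched with WWW::Selenium roughly as follows. This is only an outline under assumptions: it needs a running Selenium RC server, and every locator, the pop-up window name, and the base URL are hypothetical values the caller must supply after inspecting the real page. Nothing here is known to match Gmail's actual markup, and the tasks view may turn out not to be a separate window at all.

```perl
use strict;
use warnings;

# Log in, open the tasks pop-up, and return its HTML.
# All locators and names are passed in by the caller; none are known values.
sub fetch_tasks_html {
    my (%arg) = @_;
    require WWW::Selenium;   # from CPAN; loaded only when the sub is called

    my $sel = WWW::Selenium->new(
        host        => $arg{host} || 'localhost',
        port        => $arg{port} || 4444,
        browser     => '*firefox',
        browser_url => $arg{base_url},
    );
    $sel->start;

    # Step 1: log in.
    $sel->open('/');
    $sel->type($arg{user_locator}, $arg{user});
    $sel->type($arg{pass_locator}, $arg{pass});
    $sel->click($arg{submit_locator});
    $sel->wait_for_page_to_load(30000);

    # Step 2: click the "tasks" link.
    $sel->click($arg{tasks_locator});

    # Step 3: find the tasks pop-up window and switch to it.
    $sel->wait_for_pop_up($arg{popup_name}, 30000);
    $sel->select_window($arg{popup_name});

    # Step 4: fetch the HTML of the pop-up.
    my $html = $sel->get_html_source;
    $sel->stop;

    # Step 5 (parsing) is left to whatever HTML parser you prefer.
    return $html;
}
```

Reporting which of these calls fails, and with what error, is the kind of detail that makes a follow-up question answerable.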
Re: How to scrape data from ajax calls by locked_user sundialsvc4 (Abbot) on Jan 08, 2009 at 14:48 UTC
Well, I see about 219 CPAN packages for “GMail” at search.cpan.org, and 548 for “Google,” so perhaps you could start there... Remember: “DRY = Don't Repeat Yourself.” In fact, don't repeat anyone in the world if you can help it. You can be absolutely sure that you are not the first person to have worked on getting useful information from Google or GMail. You can also be sure that, as soon as someone's put together a decent and general-purpose “way to do that,” it's going to show up on CPAN. Therefore, practical software-development in the Perl world consists very heavily of searching for, discovering, and then leveraging existing well-tested software assets from CPAN and other sources. Your task is surely no exception. There is absolutely nothing about “dealing with AJAX, either as a client or as a server,” that you must “invent.” This way of thinking does take some getting used to, because in the academic world “borrowing somebody else's work” is called “cheating.”