in reply to Passing Values to a webpage and retrieving an answer

Well, I did not have much luck with the Curl approach that the website suggested. They seem to have some admin/volunteer problems at the moment, although the concept sounds good. I just could not easily get Curl installed on my Windows machine.

This is different enough that I thought a new response was warranted rather than just an update to the previous response. I'm not sure if the "sentiment values" actually correspond to the graph on the website, but these results seem reasonable and seem to "jibe" with the "sentiment meter" readings.

Here is some LWP code for you.

Figuring out (a) which URL to send the POST back to and (b) which parameters it expects can be frustrating. With LWP you have to work this out from the page source that is sent to you (which may include Javascript, and this one does). Usually you don't have to run the Javascript yourself; you just reply with what that script would send.

#!/usr/bin/perl -w
use strict;
use LWP::UserAgent;

$|=1;
my $DEBUG =0;

my $url = "http://sentimentanalyzer.appspot.com/";

my $ua = LWP::UserAgent->new or die "New UserAgent Failed";

# get the main page ... this works ...
#
my $response = $ua->get($url) or die "Problem getting $url\n";
$response->is_success or
    die "Failed to GET '$url': ", $response->status_line;

my $html_page = $response->content( );
print $html_page if $DEBUG;

# The trouble starts with figuring out the correct stuff to
# POST back to the website...
# You have to know what URL to post back to - (not necessarily
# exactly where it came from) and also what to send in the POST!

# This site does not appear to use cookies
# The URL to POST back to is slightly different than the
# main URL - here is not "dynamic" and probably this works
# even without retrieving the main site - When SSL and cookies
# get involved, it can be more complicated.

foreach my $content ("A dog not like a cat.",
                     "This is bogus!",
                     "I really love my grandmother.",
                    )
{
    $response = $ua->post(
        'http://sentimentanalyzer.appspot.com/api/classify',
        [
            'content' => "$content",
            'value'   => "Submit",
            'lang'    => 'en',
        ],
    );
    $response->is_success or die "Error: ", $response->status_line;

    $html_page = $response->content();
    print $html_page if $DEBUG;

    #This is "Very Ugly", but appears to work
    my ($score) = $html_page =~ /\{\"score\":(.*)\s*\}/;
    print "$score $content\n";
}
__END__
0.003720114371614687 A dog not like a cat.
0.81868251185499441 This is bogus!
0.90770799032922367 I really love my grandmother.

Update:

Firefox has moved some of this stuff around. To view the page source sent to the browser: Tools|Web Developer|Page Source. There are other tools like HTTPfox and others that I don't know how to use well yet, but other Monks will!

Re^2: Passing Values to a webpage and retrieving an answer
by nateg (Initiate) on May 01, 2012 at 15:31 UTC

    Thank you! It works great!

    How did you know how to correctly write the:

    $response = $ua->post(
        'http://sentimentanalyzer.appspot.com/api/classify',
        [
            'content' => "$content",
            'value'   => "Submit",
            'lang'    => 'en',
        ],
    );

      I looked at the source code of the webpage, then used a combination of guesswork, hacking and luck. I'm no guru at reading HTML page code. Basically I was looking for what actually happens when the "submit" button is pressed by the user. Other things happen too, like enforcing the 25,000 character limit, which I didn't bother with.
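One way to check a guess like this without touching the network is to build the request and print it instead of sending it. This is only a sketch: it reuses the URL and the three parameters from the script above, and the helper name build_classify_request is made up for illustration. HTTP::Request::Common is the module LWP uses to construct a form POST from a list of key/value pairs.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Request::Common qw(POST);

# Hypothetical helper: build (but do not send) the same form POST
# that the script above hands to $ua->post.
sub build_classify_request {
    my ($text) = @_;
    return POST(
        'http://sentimentanalyzer.appspot.com/api/classify',
        [
            content => $text,
            value   => 'Submit',
            lang    => 'en',
        ],
    );
}

# Print the request line, headers and form-encoded body for inspection.
my $request = build_classify_request('I really love my grandmother.');
print $request->as_string;
```

Comparing this output with what a browser tool shows for a real submission is a quick way to spot a missing or misnamed parameter. If it looks right, $ua->request($request) would send it.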

      When experimenting with my browser and examining the returned page source for a query, I noticed that this "sentiment" value is actually on the last line of the returned page, so I wrote an ugly little hack to get that value. Of course this is "fragile", but then again it only took me a couple of minutes to do.
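A slightly less fragile variant of that hack, as a rough sketch: assuming the reply really does contain a small JSON object such as {"score": ...} (which is what the regex in the script suggests, but I have not verified the full response format), the fragment can be handed to a JSON decoder instead of capturing everything after the colon. JSON::PP ships with recent Perls; the helper name extract_score and the sample body are made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Hypothetical helper: find the first {...} fragment in the response
# body and decode it as JSON. Returns undef if nothing usable is found.
sub extract_score {
    my ($body) = @_;
    my ($json_text) = $body =~ /(\{[^{}]*\})/ or return;
    my $data = eval { decode_json($json_text) };   # undef on malformed JSON
    return unless ref($data) eq 'HASH';
    return $data->{score};
}

# Made-up response body, only to exercise the helper.
my $sample = qq(some page text\n{"score": 0.75}\n);
my $score  = extract_score($sample);
print defined $score ? "score: $score\n" : "no score found\n";
```

In the loop of the original script this would replace the regex line with something like my $score = extract_score($html_page), plus a check for undef.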

      WWW::Mechanize can do a lot of this low level work for you, but I'm not as familiar with it. There is also a command line tool that can help build Mechanize scripts, but I've forgotten the exact name of that thing right now. I did play with it some time ago and thought it was pretty cool; it's written by Corion. I don't write these scripts very often. These higher level tools are probably a better place for you to start. I'd google around for Mechanize and look at some of the cookbooks, etc. I think there are also various Firefox tools that can show you what the browser posts back to the server, but I didn't need them for this simple project.

      I tried the suggestion given on the webpage (Curl) and it didn't work out so well, so I tried another way to give you some sort of a workable answer. I don't claim that it is the "best" answer, but it is an answer. Thanks for giving the URL that you were dealing with - actual code would not have been possible without that!

      This is just a few snippets from the page source... take a look at the full thing with your browser if you are interested...here are a few places to focus your attention.

      function doSubmit() {}
      the post back URL is in here...

      further down some HTML code...
      looking for what happens when the "submit" button is pressed
      <input type="button" id="submit" name="submit" value="Submit" onclick="doSubmit()">

      other code gives the language options...

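For pages that use an ordinary HTML form rather than a Javascript handler, HTML::Form can do some of this reading for you by listing each form's action URL, method and field names. The sketch below runs on a made-up page fragment, not the real page source (which wires its button to doSubmit() instead), and describe_forms is a hypothetical helper name.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Form;

# Made-up fragment for illustration only; NOT the real page's markup.
my $html = <<'HTML';
<form action="/api/classify" method="post">
  <textarea name="content"></textarea>
  <select name="lang"><option value="en">English</option></select>
  <input type="submit" name="value" value="Submit">
</form>
HTML

# Hypothetical helper: summarize every <form> found in a page.
sub describe_forms {
    my ($page, $base_url) = @_;
    my @summary;
    for my $form (HTML::Form->parse($page, $base_url)) {
        my @names = grep { defined } map { $_->name } $form->inputs;
        push @summary, {
            method => $form->method,
            action => "" . $form->action,   # URI object, made absolute and stringified
            fields => \@names,
        };
    }
    return @summary;
}

for my $f (describe_forms($html, 'http://sentimentanalyzer.appspot.com/')) {
    print "$f->{method} $f->{action}\n";
    print "  field: $_\n" for @{ $f->{fields} };
}
```

When the page drives the submit from Javascript, as this one does, there may be no usable form to parse, and reading the script by hand is still the way to go.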
      Update: Now that you have a working example using lower level LWP, see if you can replicate the functionality with Mechanize. This stuff can get very tricky depending upon the website! I think it took me a whole week to get my first LWP program working! In terms of complexity, this site is one of the "easier" ones. However, expect to spend a considerable number of hours working on your first one.