nateg has asked for the wisdom of the Perl Monks concerning the following question:

I am very new to Perl, so please be gentle. I'm trying to insert text into the website http://sentimentanalyzer.appspot.com/, activate a submit button, and then retrieve a value from a variable. I tried fetching the site and updating the textbox like this:

my $response = $ua->get('http://sentimentanalyzer.appspot.com','input'=>$te);
but it didn't work. I also tried activating the button there but failed.

Thank you for your help!

Replies are listed 'Best First'.
Re: Passing Values to a webpage and retrieving an answer
by Corion (Patriarch) on Apr 30, 2012 at 22:00 UTC

    So, how did your code fail for you? What exact code did you use? Please post a small, self-contained example that reproduces the problem. The code you've shown does not tell us what modules you are using. It helps us to help you better if you show us a short but complete program and tell us what output you get and what output you expect.

      I tried using WWW::Mechanize to do it (after searching the web). What I have so far is:

      my $ua = LWP::UserAgent->new;
      $ua->env_proxy;
      $te='hallo hallo hallo';
      print $te;
      my $response = $ua->get('http://sentimentanalyzer.appspot.com','input'=>$te);
      if ($response->is_success) {
          #In here i want to retrieve a value from the site after
          # inserting the text and hitting the "submit" button.
      }

      The textarea in the site is named "input".

      I also couldn't figure out how to activate the button.

        So, where in that script do you use WWW::Mechanize?

        Also, again, please do post a complete script, together with the output you get and the output you expect. The script you posted above does not run, because it does not load any of the needed modules.

Re: Passing Values to a webpage and retrieving an answer
by Marshall (Canon) on May 01, 2012 at 01:47 UTC
    Well, I was not successful in easily using this Curl idea as the website suggested...they seem to have some admin/volunteer problems at the moment, although the concept sounds good. I just could not easily get Curl installed on my Windows machine.

    This is different enough that I thought a new response was warranted rather than just an update to the previous response. I'm not sure if the "sentiment values" actually correspond to the graph on the website, but these results seem reasonable and seem to "jibe" with the "sentiment meter" readings.

    Here is some LWP code for you.

    Figuring out (a) which URL to send the POST to and (b) which parameters it expects can be frustrating. With LWP you have to work this out by reading the page source that is sent to you (which may include JavaScript, and this one does). Usually you don't have to run the JavaScript yourself; you just send what that script would send.

    #!/usr/bin/perl -w
    use strict;
    use LWP::UserAgent;
    $|=1;

    my $DEBUG =0;

    my $url = "http://sentimentanalyzer.appspot.com/";

    my $ua = LWP::UserAgent->new or die "New UserAgent Failed";

    # get the main page ... this works ...
    #
    my $response = $ua->get($url) or die "Problem getting $url\n";
    $response->is_success or
        die "Failed to GET '$url': ", $response->status_line;

    my $html_page = $response->content( );
    print $html_page if $DEBUG;

    # The trouble starts with figuring out the correct stuff to
    # POST back to the website...
    # You have to know what URL to post back to - (not necessarily
    # exactly where it came from) and also what to send in the POST!
    # This site does not appear to use cookies
    # The URL to POST back to is slightly different than the
    # main URL - here is not "dynamic" and probably this works
    # even without retrieving the main site - When SSL and cookies
    # get involved, it can be more complicated.

    foreach my $content ("A dog not like a cat.",
                         "This is bogus!",
                         "I really love my grandmother.",
                        )
    {
        $response = $ua->post(
            'http://sentimentanalyzer.appspot.com/api/classify',
            [
                'content' => "$content",
                'value'   => "Submit",
                'lang'    => 'en',
            ],
        );
        $response->is_success or die "Error: ", $response->status_line;
        $html_page = $response->content();
        print $html_page if $DEBUG;

        #This is "Very Ugly", but appears to work
        my ($score) = $html_page =~ /\{\"score\":(.*)\s*\}/;
        print "$score $content\n";
    }
    __END__
    0.003720114371614687 A dog not like a cat.
    0.81868251185499441 This is bogus!
    0.90770799032922367 I really love my grandmother.
    Update:

    Firefox has moved some of this stuff around. To view the page source sent to the browser: Tools|Web Developer|Page Source. There are other tools like HTTPfox and others that I don't know how to use well yet, but other Monks will!

      Thank you! It works great!

      How did you know how to correctly write the:

      $response = $ua->post(
          'http://sentimentanalyzer.appspot.com/api/classify',
          [
              'content' => "$content",
              'value'   => "Submit",
              'lang'    => 'en',
          ],
      );

        I looked at the source code of the webpage, then used a combination of guesswork, hacking and luck. I'm no guru at reading HTML page code. Basically I was looking for what actually happens when the "submit" button is pressed by the user. Other things happen, like enforcing the 25,000 character limit, which I didn't bother with.

        When experimenting with my browser and examining the returned page source for a query, I noticed that this "sentiment" value is actually on the last line of the returned page and I wrote an ugly little hack to get that value. Of course this is "fragile", but then again it only took me a couple of minutes to do.
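A slightly less fragile way to pull out the value is to decode the trailing JSON object instead of matching it with a regex. The sketch below runs offline on a made-up response body. It assumes the reply ends with a JSON object of the form {"score": ...}, which is only inferred from the regex used in the script above; the sample body and the helper name extract_score are hypothetical.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);   # ships with Perl since 5.14

# Hypothetical response body; the real format was not shown in the thread.
my $sample_body = qq(some leading text\n{"score": 0.90770799032922367}\n);

# Return the "score" field of the JSON object that ends the text,
# or nothing if no such object is found or it does not parse.
sub extract_score {
    my ($text) = @_;
    my ($json) = $text =~ /(\{[^{}]*\})\s*\z/ or return;
    my $data = eval { decode_json($json) } or return;
    return $data->{score};
}

my $score = extract_score($sample_body);
print defined $score ? "score: $score\n" : "no score found\n";
```

With the script above, the call would be extract_score($response->decoded_content) in place of the regex match.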

        WWW::Mechanize can do a lot of this low-level work for you, but I'm not as familiar with it. There is also a command line tool that can help build Mechanize scripts, but I've forgotten the exact name of that thing right now. I did play with it some time ago and thought it was pretty cool; it's written by Corion. I don't do these scripts very often. These are higher-level tools that are probably a better place for you to start. I'd google around for Mechanize and look at some of the cookbooks, etc. I think there are also various Firefox tools that can give you the contents of what the browser posts back to the server, but I didn't need them for this simple project.

        I tried the suggestion given on the webpage (Curl) and it didn't work out so well, so I tried another way to give you some sort of a workable answer. I don't claim that it is the "best" answer, but it is an answer. Thanks for giving the URL that you were dealing with - actual code would not have been possible without that!

        This is just a few snippets from the page source... take a look at the full thing with your browser if you are interested...here are a few places to focus your attention.

        function doSubmit() {}
        the post back URL is in here...

        further down some HTML code...
        looking for what happens when the "submit" button is pressed
        <input type="button" id="submit" name="submit" value="Submit" onclick="doSubmit()">

        other code gives the language options...
        Update: Now that you have a working example using lower level LWP, see if you can replicate the functionality with Mechanize. This stuff can get very tricky depending upon the website! I think it took me a whole week to get my first LWP program working! In terms of complexity, this site is one of the "easier" ones. However, expect to spend a considerable number of hours working on your first one.
Re: Passing Values to a webpage and retrieving an answer
by Marshall (Canon) on Apr 30, 2012 at 22:23 UTC
    The type of response that you are going to get is the webpage source code - that's the code that your browser would run to present the user display. The LWP family of routines essentially emulates a browser (the server cannot tell the difference). So,
    my $response = $ua->get('http://sentimentanalyzer.appspot.com');
    would suffice for that purpose. The 'input' => $te argument does nothing useful here. In order to send a response to this server, you need to use the POST method to send your text back (based upon the received webpage) and specify the action. This will have the effect of clicking on the "submit" button.
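To make the POST idea concrete, here is a minimal sketch that only builds the request and prints it, without contacting the server. The endpoint and the field names (content, value, lang) are taken from the LWP script posted elsewhere in this thread, not from any documentation of the site, so treat them as assumptions.

```perl
use strict;
use warnings;
use HTTP::Request::Common qw(POST);

# Endpoint and field names are assumptions copied from this thread.
my $req = POST(
    'http://sentimentanalyzer.appspot.com/api/classify',
    [
        content => 'I really love my grandmother.',
        value   => 'Submit',
        lang    => 'en',
    ],
);

# Show what would go over the wire; nothing is sent here.
my $type = $req->content_type;
print $req->method, ' ', $req->uri, "\n";
print "Content-Type: $type\n";
print $req->content, "\n";
```

The form fields end up URL-encoded in the request body rather than in the URL. Passing the same request object to LWP::UserAgent's request() method would actually send it.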

    While looking at this page, I learned of this "curl" freeware program. They give an example of how to use it with their page. That looks pretty straight-forward to me for this simple application. But I have never used Curl.

    If this is all you need to do, I would follow their suggestion and use this freeware Curl thing. I've written a number of these LWP programs and often spent hours or even days figuring out the right "incantation".

    Anyway, if you just need this one website's function, I would install and run Curl, my $result = `curl ...whatever...` . Get that working, then tackle a more general solution with Perl's various web modules, including of course LWP and Mechanize. This can get very complicated!

    I would concentrate on making your first Perl web program a success rather than having to know/learn a lot of sophisticated stuff that is not necessary for the job at hand.

    Update: Just tried this advice, and although it sounded good from the website, their main mirror site is having troubles now and I was unable to download an "unzip and go" (i.e. pre-built) Windows version of this thing. So evidently this may take more work than the main website claims.

Re: Passing Values to a webpage and retrieving an answer
by JavaFan (Canon) on Apr 30, 2012 at 22:19 UTC
    The additional arguments to get() set HTTP headers. They don't modify the URL to set HTML form parameters.

      How can I use it correctly with URL parameters?
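One common way to put parameters into the URL is to build it with the URI module and pass the result to get(). The sketch below runs offline and contrasts that with the extra-argument form, which only sets a header. Whether this particular site reads an "input" query parameter at all is an assumption; the LWP script elsewhere in this thread suggests it expects a POST instead.

```perl
use strict;
use warnings;
use URI;
use HTTP::Request::Common qw(GET);

my $text = 'hallo hallo hallo';

# Query parameters belong in the URL itself; query_form() escapes them.
my $uri = URI->new('http://sentimentanalyzer.appspot.com/');
$uri->query_form(input => $text);
print "URL with parameters: $uri\n";

# Extra key/value arguments become HTTP headers, as with $ua->get().
my $req = GET('http://sentimentanalyzer.appspot.com/', input => $text);
print "URL of header variant: ", $req->uri, "\n";
print "Header 'input': ", $req->header('input'), "\n";
```

The first form would then be fetched with $ua->get($uri).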