Simple requests using LWP::UserAgent

jlongino has asked for the wisdom of the Perl Monks concerning the following question:

This is the first time that I've used LWP to process some simple requests. Here's the code:

use strict;
require LWP::UserAgent;

my $ua = LWP::UserAgent->new (env_proxy  => 1,
                              keep_alive => 1,
                              timeout    => 30);

while(<DATA>) {
   chomp;
   my $req = HTTP::Request->new('GET', $_);
   my $resp = $ua->request($req);
   if ($resp->is_success) {
      # print $resp->content;
      print "OK -----> '", $_, "'\n";
   } else {
      print "FAILED -> '" ,$_, "'\n";
   }
   select((select(STDOUT), $| = 1)[0]); #flush STDOUT buffer
}
print "Finished.\n"

__DATA__
http://www.thehungersite.com/cgi-bin/WebObjects/CTDSites.woa/60/wo/SJ50004g800Ig400Xz/0.0.33.13.0.1.0.0.0.CustomContentActiveImageDisplayComponent.0.0.0
http://www.thebreastcancersite.com/cgi-bin/WebObjects/CTDSites.woa/60/wo/SJ50004g800Ig400Xz/2.0.33.13.0.1.0.1.0.CustomContentActiveImageDisplayComponent.0.0.0
http://www.therainforestsite.com/cgi-bin/WebObjects/CTDSites.woa/60/wo/SJ50004g800Ig400Xz/5.0.33.13.0.1.0.0.0.CustomContentActiveImageDisplayComponent.0.0.0
http://www.ecologyfund.com/registry/ecology/03_donate.html?noheader=-1
http://www.ecologyfund.com/registry/ecology/donate_pol.html?noheader=-1
http://www.ecologyfund.com/registry/ecology/05_donate.html?noheader=-1
http://www.ecologyfund.com/registry/ecology/07_donate.html?noheader=-1
http://www.ecologyfund.com/registry/ecology/04_donate.html?noheader=-1
http://www.ecologyfund.com/registry/ecology/01_donate.html?noheader=-1
http://www.ecologyfund.com/registry/ecology/08_donate.html?noheader=-1
http://www.ecologyfund.com/registry/ecology/02_donate.html?noheader=-1

As you can see, I'm not really interested in any returned content. I just want to automate several button clicks (each URL is the action performed by one button press) and check that no error was encountered.

Some questions:

Is it OK to leave the LWP::UserAgent->new statement outside of the while loop as I have done or should it go inside? It seems to work the same either way.
I also noticed that before I added the STDOUT buffer flush, the "OK" and "FAILED" messages were printed all at once (apparently, not as each is completed). Is this behaviour to be expected whenever you use LWP or have I hit an unusual circumstance?

Any suggestions welcomed as I plan to use LWP more and want to go about it correctly and with as few headaches as possible. Thanks.

--Jim


(jeffa) Re: Simple requests using LWP::UserAgent by jeffa (Bishop) on Dec 06, 2001 at 00:00 UTC
1) Leaving the instantiation of LWP::UserAgent outside of the while loop is better because the object is reusable. Multiple instantiations are unnecessary and wasteful.

2) Just add `$| = 1;` to the top of your code and be done with it. Normally, buffering is not an issue with most scripts, but sophisticated network I/O should turn buffering off (the same as turning auto-flushing on) because you don't want to be stuck waiting for your data to be sent to the server.

jeffa

L-LL-L--L-LL-L--L-LL-L--
-R--R-RR-R--R-RR-R--R-RR
F--F--F--F--F--F--F--F--
(the triplet paradiddle)
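Both points can be sketched together. This is only an illustration, not the original poster's script: the helper name `check_url` is made up, and the two `file:` URLs are stand-ins (one temporary file that exists, one that does not) so the sketch runs without touching the network. With real `http://` URLs the loop would look the same.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use File::Temp qw(tempfile);
use URI::file;

$| = 1;    # turn on autoflush for STDOUT once, up front

# One user agent, created once and reused for every request.
my $ua = LWP::UserAgent->new(timeout => 30);

# Hypothetical helper: returns "OK" or "FAILED" for a single URL.
sub check_url {
    my ($agent, $url) = @_;
    return $agent->get($url)->is_success ? 'OK' : 'FAILED';
}

# Stand-in URLs so the sketch runs offline:
# an existing (empty) local file and a path that does not exist.
my ($fh, $path) = tempfile(UNLINK => 1);
close $fh;
my @urls = (
    URI::file->new($path)->as_string,
    URI::file->new("$path.missing")->as_string,
);

for my $url (@urls) {
    # Each line is written out as soon as it is printed.
    printf "%-6s -> '%s'\n", check_url($ua, $url), $url;
}
```

Because `$|` is set once before the loop, there is no need to repeat the `select` idiom on every iteration.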
(ar0n) Re: Simple requests using LWP::UserAgent by ar0n (Priest) on Dec 06, 2001 at 00:32 UTC
You might want to check out HTTP::Request::Common as an easy way to build HTTP requests, e.g.:

use HTTP::Request::Common qw/POST GET/;

my $req = POST $url, [ foo => "blah", bar => "hurg", stumbit => "sexisgood" ];

if ($lwp->request($req)->is_success()) {
    # ...
}

[ ar0n -- want job (boston) ]
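To see what those helpers actually build, here is a small sketch that constructs a GET and a POST request and inspects them without sending anything. The `www.example.com` URLs and the form field names are placeholders, not anything from the original question.

```perl
use strict;
use warnings;
use HTTP::Request::Common qw(GET POST);

# GET takes just the URL (placeholder address).
my $get = GET 'http://www.example.com/donate.html?noheader=-1';

# POST takes the URL plus an array ref of form fields,
# which are URL-encoded into the request body.
my $post = POST 'http://www.example.com/form',
    [ foo => 'blah', bar => 'hurg' ];

print $get->method,  ' ', $get->uri,  "\n";
print $post->method, ' ', $post->uri, "\n";
print $post->header('Content-Type'), "\n";
print $post->content, "\n";    # the encoded form fields
```

Either object can then be handed to `$ua->request(...)` in place of a hand-built HTTP::Request.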
Re: Simple requests using LWP::UserAgent by blakem (Monsignor) on Dec 06, 2001 at 01:24 UTC
"I just want to automate several button clicks"

That sounds like a job for webchat (aka WWW::Chat).

-Blake
Re: STDOUT flush, Was: Simple requests using LWP::UserAgent by baku (Scribe) on Dec 06, 2001 at 02:32 UTC
STDOUT is a buffered handle. Buffered I/O is typical on Unix systems: it saves the system from having to start I/O, perform I/O, and stop I/O as often, by waiting until a certain amount of output is "ready" before writing it. Imagine, for example, if your printer was warmed up, told to position the print head, write a single character, then told to reset the print head, every time you did a `print LPR 'x'`: a very bad thing. On a smaller scale, this also applies to disc I/O, network communications, etc.; there's almost always a "cost" to starting and stopping I/O.

However, when you're working with a "tty," it's pretty unlikely to be an actual TeleType(TM) these days. What I tend to do is add a chunk like this up front:

use FileHandle;    # IO::Handle (or IO::File) would also do
if ( -t STDOUT ) {
    STDOUT->autoflush(1);
}

# OR: without FileHandle, save some RAM at the expense
# of legibility:

# braces keep "my" vars local here.
{
    # Save the currently select:ed filehandle!
    my $saved_selected_fh = select STDOUT;
    if ( -t STDOUT ) {    # a bare -t would test STDIN, not STDOUT
        ++$|;             # turn on "autoflush"
    }
    select $saved_selected_fh;    # restore the previous default handle
}

N.B.: your "flush" code is not really a flush (see the `flush` method in `perldoc IO::Handle`); it's actually turning on autoflush mode for STDOUT. That's a fancy way of saying that Perl will automatically `flush` the STDOUT buffer, i.e. tell the OS to do the I/O right now without waiting for more data, every time you do a `print` or other write to it.

Since TTY or disc I/O is so fast, I'd recommend just sticking a `++$|` at the beginning, rather than re-enabling autoflush for each line of data. (`$|` turns on autoflush for the currently selected output channel, which is STDOUT unless you've done a `select`; the current channel is the one that `print LIST` will print to, as opposed to `print FILEHANDLE LIST`.) Re-enabling it every time is not a big performance hit, but it is a bit of overkill.

If you really do want to manually control when the buffer is flushed (barring Unix getting all smart and surprising you by doing it first ... it can if it wants to), just put a `flush STDOUT;` in there (this is the IO::Handle method, so load FileHandle or IO::Handle first).

Just for reference, on most unices the buffer for a tty is about 4Kbytes. If you changed a `print` to produce, say, 4097 (4 * 1024 + 1) bytes of output, it would probably trigger a flush right off...
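The difference between a buffered write, an explicit flush, and autoflush can be observed without a terminal. The following is a rough sketch that writes to a temporary file (an assumption made for the demo, since a file's size is easy to check) and looks at how many bytes have reached the file after each step.

```perl
use strict;
use warnings;
use IO::Handle;                 # provides the flush and autoflush methods
use File::Temp qw(tempfile);

my ($fh, $path) = tempfile(UNLINK => 1);

# 1. Plain buffered write: the text normally sits in Perl's buffer,
#    so the file on disc is still empty.
print {$fh} "buffered\n";
my $before = (-s $path) || 0;

# 2. Explicit flush: push whatever is buffered out now.
$fh->flush;
my $after_flush = (-s $path) || 0;

# 3. Autoflush: from here on, every print is written out immediately.
$fh->autoflush(1);
print {$fh} "immediate\n";
my $after_auto = (-s $path) || 0;

printf "before flush: %d bytes, after flush: %d, after autoflush print: %d\n",
    $before, $after_flush, $after_auto;
```

The same three behaviours apply to STDOUT when it is redirected to a file or a pipe, which is where delayed output is usually noticed.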