jlongino has asked for the wisdom of the Perl Monks concerning the following question:

This is the first time that I've used LWP to process some simple requests. Here's the code:
use strict;
require LWP::UserAgent;

my $ua = LWP::UserAgent->new (env_proxy => 1, keep_alive => 1, timeout => 30);

while(<DATA>) {
    chomp;
    my $req = HTTP::Request->new('GET', $_);
    my $resp = $ua->request($req);
    if ($resp->is_success) {
        # print $resp->content;
        print "OK -----> '", $_, "'\n";
    }
    else {
        print "FAILED -> '" ,$_, "'\n";
    }
    select((select(STDOUT), $| = 1)[0]); #flush STDOUT buffer
}
print "Finished.\n";

__DATA__
http://www.thehungersite.com/cgi-bin/WebObjects/CTDSites.woa/60/wo/SJ50004g800Ig400Xz/0.0.33.13.0.1.0.0.0.CustomContentActiveImageDisplayComponent.0.0.0
http://www.thebreastcancersite.com/cgi-bin/WebObjects/CTDSites.woa/60/wo/SJ50004g800Ig400Xz/2.0.33.13.0.1.0.1.0.CustomContentActiveImageDisplayComponent.0.0.0
http://www.therainforestsite.com/cgi-bin/WebObjects/CTDSites.woa/60/wo/SJ50004g800Ig400Xz/5.0.33.13.0.1.0.0.0.CustomContentActiveImageDisplayComponent.0.0.0
http://www.ecologyfund.com/registry/ecology/03_donate.html?noheader=-1
http://www.ecologyfund.com/registry/ecology/donate_pol.html?noheader=-1
http://www.ecologyfund.com/registry/ecology/05_donate.html?noheader=-1
http://www.ecologyfund.com/registry/ecology/07_donate.html?noheader=-1
http://www.ecologyfund.com/registry/ecology/04_donate.html?noheader=-1
http://www.ecologyfund.com/registry/ecology/01_donate.html?noheader=-1
http://www.ecologyfund.com/registry/ecology/08_donate.html?noheader=-1
http://www.ecologyfund.com/registry/ecology/02_donate.html?noheader=-1

As you can see, I'm not really interested in any returned content; I just want to automate several button clicks (each URL is an action performed by a button press) and see that no error was encountered.

Some questions:

Any suggestions are welcome, as I plan to use LWP more and want to go about it correctly and with as few headaches as possible. Thanks.

--Jim

Replies are listed 'Best First'.
(jeffa) Re: Simple requests using LWP::UserAgent
by jeffa (Bishop) on Dec 06, 2001 at 00:00 UTC
    1) Leaving the instantiation of LWP::UserAgent outside of the while loop is better because it is a reusable object. Multiple instantiations are unnecessary and wasteful.

    2) Just add $| = 1; to the top of your code and be done with it. Normally, buffering is not an issue with most scripts, but scripts doing sophisticated network I/O should turn buffering off (the same as turning auto-flushing on) because you don't want to be stuck waiting for your data to be sent to the server.
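Point 1 can be sketched roughly as follows: one user agent is created once and handed to a helper that reuses it for every URL. The helper name check_urls and the hash keys it returns are made up for illustration; get, is_success and status_line are the usual LWP::UserAgent and HTTP::Response methods. LWP is only loaded when URLs are given on the command line.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of LWP): fetch every URL with the SAME
# user agent object and record whether each request succeeded.
sub check_urls {
    my ($ua, @urls) = @_;
    my @results;
    for my $url (@urls) {
        my $resp = $ua->get($url);    # the one $ua is reused on every pass
        push @results, {
            url    => $url,
            ok     => $resp->is_success ? 1 : 0,
            status => $resp->status_line,
        };
    }
    return @results;
}

# Only touch the network when URLs were passed on the command line.
if (@ARGV) {
    require LWP::UserAgent;
    $| = 1;    # autoflush STDOUT once, up front

    # Created once, outside any loop, with the options from the question.
    my $ua = LWP::UserAgent->new(env_proxy => 1, keep_alive => 1, timeout => 30);

    for my $r (check_urls($ua, @ARGV)) {
        printf "%-6s -> '%s' (%s)\n",
            ($r->{ok} ? 'OK' : 'FAILED'), $r->{url}, $r->{status};
    }
    print "Finished.\n";
}
```

Printing the status line alongside OK/FAILED is an optional extra; it tends to make failed requests easier to diagnose.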

    jeffa

    L-LL-L--L-LL-L--L-LL-L--
    -R--R-RR-R--R-RR-R--R-RR
    F--F--F--F--F--F--F--F--
    (the triplet paradiddle)
    
(ar0n) Re: Simple requests using LWP::UserAgent
by ar0n (Priest) on Dec 06, 2001 at 00:32 UTC
    You might want to check out HTTP::Request::Common as an easy way to build HTTP requests, e.g.:
        use HTTP::Request::Common qw/POST GET/;

        my $req = POST $url, [
            foo     => "blah",
            bar     => "hurg",
            stumbit => "sexisgood"
        ];

        if ($lwp->request($req)->is_success()) {
            # ...
        }

    [ ar0n -- want job (boston) ]

Re: Simple requests using LWP::UserAgent
by blakem (Monsignor) on Dec 06, 2001 at 01:24 UTC
Re: STDOUT flush, Was: Simple requests using LWP::UserAgent
by baku (Scribe) on Dec 06, 2001 at 02:32 UTC

    STDOUT is a buffered handle. Buffered I/O is typical on Unix systems: it saves the system from having to start I/O, perform I/O, and stop I/O as often, by waiting for a certain amount of I/O to be "ready" before doing it. Imagine, for example, if your printer was warmed up, told to position the print head, write a single character, then told to reset the print head, every time you did a print LPR 'x'. That would be a very bad thing. On a smaller scale, this also applies to disc I/O, network communications, &c.; there's almost always a "cost" to starting and stopping I/O.

    However, when you're working with a "tty" (which is pretty unlikely to be an actual TeleType(TM) these days), you usually want to see output as soon as it's printed. What I tend to do is add a chunk like this up front:

        use FileHandle;    # IO::Handle also provides autoflush
        if ( -t STDOUT ) {
            STDOUT->autoflush(1);
        }

        # OR: without FileHandle, save some RAM at the expense
        # of legibility:

        # braces keep "my" vars local here.
        {
            # Save the currently select:ed filehandle!
            my $saved_selected_fh = select STDOUT;
            if ( -t STDOUT ) {    # a bare -t would test STDIN, not STDOUT
                ++$|;             # turn on "autoflush"
            }
            select $saved_selected_fh;    # restore the saved filehandle
        }

    N.B. that your "flush" code is not a flush (flush is a method documented in IO::Handle, not a built-in); it's actually turning on autoflush mode on STDOUT. That's a fancy way of saying that Perl will automatically flush the STDOUT buffer (i.e. tell the OS to do the I/O right now, don't wait for more data) every time you do a print or other write to it.
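The difference between a one-off flush and autoflush mode can be observed with a small experiment. This is only a sketch: it writes to a temporary file rather than STDOUT so the effect can be measured by file size, and the helper names size_of and flush_demo are invented for the illustration.

```perl
use strict;
use warnings;
use IO::Handle;                  # provides the flush and autoflush methods
use File::Temp qw(tempfile);

# Hypothetical helper: size of the file on disk, 0 if it is still empty.
sub size_of {
    my ($path) = @_;
    return (-s $path) || 0;
}

# Write to a temp file and sample its on-disk size at three moments.
sub flush_demo {
    my ($fh, $path) = tempfile(UNLINK => 1);
    binmode $fh;                 # keep "\n" as a single byte everywhere

    print {$fh} "hello\n";       # lands in Perl's buffer, not yet on disk
    my $buffered = size_of($path);

    $fh->flush;                  # one-off flush: push the buffer out now
    my $flushed = size_of($path);

    $fh->autoflush(1);           # from here on, every print flushes itself
    print {$fh} "x\n";
    my $auto = size_of($path);

    close $fh;
    return ($buffered, $flushed, $auto);
}

my ($buffered, $flushed, $auto) = flush_demo();
print "after print: $buffered bytes, after flush: $flushed bytes, ",
      "after autoflushed print: $auto bytes\n";
```

The first number should stay at zero until flush is called, while the print issued after autoflush(1) shows up on disk without any further call.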

    Since TTY or disc I/O is so fast, I'd recommend just sticking a ++$| at the beginning, rather than re-enabling autoflush for each line of data. It's not a big performance hit, but it is a bit of overkill. (++$| turns on autoflush on the currently-selected output channel, which is STDOUT unless you've done a select; the current channel is the one that print LIST will print to, as opposed to print FILEHANDLE LIST.)
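Because $| always applies to the currently selected handle, the one-liner from the question, select((select(STDOUT), $| = 1)[0]), can be spelled out step by step. The sketch below does that in a helper whose name, enable_autoflush, is made up here; a temporary file stands in for a second output channel.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: the long-hand version of
#   select((select($fh), $| = 1)[0]);
sub enable_autoflush {
    my ($fh) = @_;
    my $previous = select($fh);   # make $fh current, remember the old handle
    $| = 1;                       # $| acts on the currently selected handle
    select($previous);            # restore, so a plain print is unaffected
    return;
}

my ($fh, $path) = tempfile(UNLINK => 1);
enable_autoflush($fh);

print {$fh} "written immediately\n";   # print FILEHANDLE LIST
printf "bytes on disk right after the print: %d\n", (-s $path) || 0;

enable_autoflush(\*STDOUT);            # same effect as $| = 1 at the top
print "a plain print still goes to the selected handle\n";   # print LIST
```

Restoring the previous handle is the part that is easy to forget; without it, every later print LIST would go to the temp file.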

    If you really do want to manually control when the buffer is flushed (barring Unix getting all smart and surprising you by doing it first ... it can if it wants to), just put a flush STDOUT; in there. This needs FileHandle or IO::Handle loaded, since flush is a method rather than a built-in.

    Just for reference, on most unices the buffer for a tty is about 4Kbytes. If you changed a print to produce, say, 4097 (4 * 1024 + 1) bytes of output, it would probably trigger a flush right off...
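The "a full buffer flushes itself" behaviour can be checked with a rough experiment. Several things here are assumptions: a temporary file stands in for the output channel (terminal output is hard to measure from inside a script), the helper name buffer_fill_demo is invented, and 64 KiB was picked simply because it is well above the few-kilobyte buffer sizes Perl builds typically use.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write a little, then a lot, with no explicit
# flush, and report how much has reached the disk after each write.
sub buffer_fill_demo {
    my ($big_bytes) = @_;
    my ($fh, $path) = tempfile(UNLINK => 1);
    binmode $fh;

    print {$fh} "x" x 10;           # far below any plausible buffer size
    my $after_small = (-s $path) || 0;

    print {$fh} "x" x $big_bytes;   # overflows the buffer, forcing writes
    my $after_big = (-s $path) || 0;

    close $fh;
    return ($after_small, $after_big);
}

my ($after_small, $after_big) = buffer_fill_demo(64 * 1024);
print "on disk after 10 bytes: $after_small; ",
      "after 64 KiB more: $after_big\n";
```

The small write should leave nothing on disk, while the large one should push data out even though neither flush nor autoflush was used.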