I recently had a frustrating debugging session in which, for a program using LWP::Simple,

my $result = get($url);

was assigning undef to $result, yet

getprint($url);

was getting and printing $url's HTML.

Digging into the code, I found out why.

getprint() (like getstore() and head()) drives HTTP::Request. But ever since libwww-perl 5.15, released November 6, 1997, get() normally doesn't. LWP::Simple rolls its own super-lightweight stand-in for HTTP::Request, _trivial_http_request, and that's what get() normally uses.

And while getprint() (etc.) sends a user agent like "LWP::Simple/5.79" (the number is libwww-perl's version) and speaks HTTP/1.1, _trivial_http_request sends "lwp-trivial/1.40" (the number is LWP::Simple's version) and speaks HTTP/1.0. So a robots.txt that allows getprint() can forbid get().

There are two cases where get() ends up using HTTP::Request anyway. If you're using a proxy (as determined by looking for the existence of an HTTP_PROXY environment variable), get() will use HTTP::Request from the start. And if _trivial_http_request gets an HTTP redirect, it'll switch to using HTTP::Request.
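To make those two rules concrete, here is a minimal sketch of the decision as I've described it. It is not LWP::Simple's actual source: get_code_path() and its env and redirected arguments are names I made up for illustration, and it only checks the exact name HTTP_PROXY, as stated above.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of LWP::Simple. It only models the two
# rules described above for which code path get() ends up on.
sub get_code_path {
    my (%args) = @_;
    my $env = $args{env} || {};

    # Rule 1: an HTTP_PROXY environment variable means the full
    # HTTP::Request machinery is used from the start.
    return 'full' if exists $env->{HTTP_PROXY};

    # Rule 2: a redirect seen by the lightweight request makes it
    # hand over to the full machinery.
    return 'full' if $args{redirected};

    # Otherwise get() stays on the lightweight path.
    return 'trivial';
}

# Report the path for the current environment, assuming no redirect.
print "get() would take the ", get_code_path(env => \%ENV), " path\n";
```

Passing the environment in as a hash reference, rather than reading %ENV inside the sub, just makes the rules easy to try out with different settings.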

Or you can import $ua, the LWP::UserAgent object LWP::Simple uses, and, as a side effect, it'll guarantee that get() always drives HTTP::Request. Remember that if you're specifying a list to import, the module's @EXPORT list won't be exported by default -- it's now incumbent upon you to include all the names you want imported.

use LWP::Simple qw($ua get);
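A slightly fuller sketch of that approach follows. Once $ua is imported it's an ordinary LWP::UserAgent object, so you can also adjust it before calling get(). The agent string and the timeout are arbitrary values I picked for illustration, and the URL is taken from the command line so that nothing is fetched unless you supply one.

```perl
use strict;
use warnings;

# Listing '$ua' turns off the default exports, so 'get' has to be
# named explicitly as well.
use LWP::Simple qw($ua get);

# $ua is the LWP::UserAgent object that LWP::Simple uses.
$ua->timeout(10);                 # seconds; arbitrary choice
$ua->agent('my-script/0.1');      # hypothetical agent string

# Fetch only if a URL was given on the command line.
my $url = shift @ARGV;
if (defined $url) {
    my $content = get($url);
    if (defined $content) {
        printf "Fetched %d characters from %s\n", length $content, $url;
    }
    else {
        warn "get() returned undef for $url\n";
    }
}
```

Run it as, say, perl fetch.pl http://www.example.com/ (the script name is made up). Since get() only ever returns the content or undef, the defined test is the only failure signal you get; if you need the status code, call $ua->get($url) directly and inspect the response object.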

I'm writing a doc patch to make some of this clearer; the maintainer, Gisle Aas, has verified that importing $ua is the only officially supported technique to force get() to use HTTP::Request.

Updated: linkified module names.


In reply to LWP::Simple: a little more complicated than it sounds by Zed_Lopez
