Popcorn Dave has asked for the wisdom of the Perl Monks concerning the following question:

Fellow monks,

I'm starting to play around with the HTTP and LWP modules and I'm running into some small problems.

My problems are with HTTP::Headers and with the output I'm getting from the code I've written.

With the HTTP::Headers module, exactly how many of the header fields do I need to specify? The web site I'm trying to get the info from has an applet that expects the root page as the referrer. I tested this theory in Opera and it wouldn't work without the Referer header being sent.

My other problem is that when I do actually get information out of the script off other sites, it's printing out the address of the hash, not the content.

My code is as follows:

#!/usr/bin/perl -w
use strict;
use LWP::UserAgent;
use HTTP::Request::Common;
use HTTP::Headers;

my @list;
my ($name, $url, $pic);
my $refurl = 'http://www.kingfeatures.com';
my %cartoon;

open FH, ">zippy.gif";
binmode FH; # necessary to convert pictures streamed to graphics

my $bot = new LWP::UserAgent;
my $stream = new HTTP::Request(
    GET => "http://www.kingfeatures.com/features/comics/mutts/aboutMaina.php"
    # Referer => $refurl
);
$pic = $bot->request($stream);
print "$pic->content($stream)\n" if $pic->is_success;
print FH $pic;
close FH;

I'd appreciate any light anybody could shed on this, and pointers to anything other than the module docs that would give me a bit more guidance.

There is no emoticon for what I'm feeling now.

Re: HTTP::Request questions
by blokhead (Monsignor) on Oct 26, 2002 at 07:30 UTC
    I think your Referer usage is OK (although I usually do it a bit differently), but right now all you are doing is creating a request object, and not executing the request! To do that you need to use an LWP::UserAgent object to query the website. Here's part of a template I use whenever I need to fetch a URL, which you may or may not find useful.
    use strict;
    use warnings;
    use HTTP::Request::Common;
    use LWP::UserAgent;
    use HTTP::Headers;

    my $refer = 'http://www.kingfeatures.com';
    my $url   = 'http://www.kingfeatures.com/stuff...';

    my $agent = LWP::UserAgent->new(keep_alive => 1, timeout => 10);

    # change this to masquerade as a browser if the site requires it.
    # ('User-Agent' must be quoted: a bare User-Agent parses as User minus Agent.)
    my $hdrs = HTTP::Headers->new('User-Agent' => 'My friendly LWP script');
    $hdrs->referer($refer);

    # pass the URL variable ($url), the method, and the prepared headers
    my $req = HTTP::Request->new(GET => $url, $hdrs);
    my $response = $agent->request($req);

    # $response is an HTTP::Response object.
    if ($response->is_success) {
        # you now have access to info like $response->headers
        # print 'Content-type: ' . $response->headers->content_type . "\n\n";

        # but what you probably want is in here:
        print $response->content;
    }
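Both scripts in this thread load HTTP::Request::Common without calling anything from it. As a minimal sketch of another way to attach the Referer, its GET helper accepts header pairs after the URL and returns an ordinary HTTP::Request. The example.com URL below is a placeholder assumption, not the real picture address.

```perl
use strict;
use warnings;
use HTTP::Request::Common qw(GET);

my $refer = 'http://www.kingfeatures.com';
my $url   = 'http://www.example.com/comic.gif';   # hypothetical URL

# GET builds an HTTP::Request; extra key/value pairs become headers.
my $req = GET $url,
    Referer      => $refer,
    'User-Agent' => 'My friendly LWP script';

# Nothing is sent yet; this only shows what the request object holds.
print $req->method, ' ', $req->uri, "\n";
print 'Referer: ', $req->header('Referer'), "\n";
```

The resulting $req can be handed to $agent->request($req) exactly like the one built with HTTP::Headers.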
    Update: OK, so I didn't see your declaration of $bot as an LWP::UserAgent there. However, I've never seen a call like the one you have: $response->content($req). What does passing the HTTP::Request object to the content method do, if anything? Also notice that you're printing $pic to the zippy.gif file, but $pic is only a blessed hash reference. This is probably not what you want. Do you want to write $pic->content to the filehandle instead?
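The "address of the hash" output comes from string interpolation: inside double quotes Perl interpolates only $pic, and the "->content" part stays literal text. A minimal sketch of the difference follows. It uses a hand-built HTTP::Response instead of a real fetch, and the body bytes and the output file name are made up for illustration.

```perl
use strict;
use warnings;
use HTTP::Response;

# Stand-in for what $bot->request(...) returns (assumption: no network here).
my $pic = HTTP::Response->new(
    200, 'OK',
    [ 'Content-Type' => 'image/gif' ],
    "GIF89a\x01\x00\x01\x00",          # made-up bytes, not a complete image
);

# Method calls are not interpolated: this stringifies the object itself.
my $wrong = "$pic->content";

# Calling the method outside the string returns the response body.
my $right = $pic->content;

if ($pic->is_success) {
    # Hypothetical output file; binmode keeps the bytes untouched.
    open my $fh, '>', 'zippy_sketch.gif' or die "Cannot open file: $!";
    binmode $fh;
    print {$fh} $right;
    close $fh or die "Cannot close file: $!";
}
```

In the original script the same change would mean printing $pic->content to FH rather than $pic.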

    blokhead

      Thanks for that!

      I'm trying to grab graphics (comics) off that particular web site and I'm just sort of starting to do this type of web page requesting. I've done things with LWP::Simple before but nothing that needed a referral page.

      I'm still trying to determine the minimum set of fields HTTP::Headers needs and how to tell what a page is looking for. Or is it just a matter of adding info until it works? I couldn't find anything in the module docs that said "You need this, this, and this at the very least".
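One way to approach the "add info until it works" loop is to look at exactly what a request carries before sending it. The sketch below starts from a bare request, adds one header at a time, and dumps the request with as_string. The example.com URL and the browser-like User-Agent value are assumptions for illustration. As far as I know nothing beyond the request line is needed for the request object itself; whatever else a site checks (Referer, User-Agent, cookies) is site-specific and has to be found by experiment.

```perl
use strict;
use warnings;
use HTTP::Request;

# Hypothetical target URL.
my $url = 'http://www.example.com/comic.gif';

# A bare request: method and URL only, no extra headers.
my $req = HTTP::Request->new(GET => $url);
my $bare = $req->as_string;

# Add one candidate header at a time and retry the fetch after each step.
$req->header(Referer => 'http://www.kingfeatures.com');
$req->header('User-Agent' => 'Mozilla/5.0 (compatible; sketch)');

# as_string shows the request line and every header currently set.
print "Before:\n$bare\n";
print "After:\n", $req->as_string, "\n";
```

After a real fetch, comparing $response->status_line between attempts shows which added header made the difference.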
