jmarans has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to retrieve the first 4k of a file from a remote web server using LWP and HTTP, and this code frag will only return the entire file. I've just pulled the latest HTTP::Request from CPAN.

my $request = HTTP::Request->new(GET => "$url/$file");
my $response = $self->{ua}->simple_request($request, "/tmp/$file", 4094);

Replies are listed 'Best First'.
Re: How do I retrieve a piece of a file? (use a callback)
by grinder (Bishop) on Nov 24, 2001 at 03:38 UTC

    You could hang a callback off the request, and die after you've seen 4096 characters. Your script won't actually die, you'll just return back from the request method. Something like this:

    use HTTP::Request;
    use LWP::UserAgent;
    my $html = '';
    my $request = HTTP::Request->new(GET => "$url/$file");
    my $ua = LWP::UserAgent->new;
    my $response = $ua->request($request, \&cb);
    sub cb {
        $html .= $_[0];
        die if length($html) > 4096;
    }

    Note that you might want to trim the $html variable back to exactly 4096 chars with substr($html, 0, 4096). Also, you may have asked the question because you know that what you are looking for is somewhere within the first 4k. Of course, if you find what you are looking for earlier, then you can die all that much earlier.

    Note that this is about as efficient as it gets; you are getting chunks of the page more or less as they are peeled off the socket and then dealing with them on the fly.
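A minimal sketch of that accumulate-and-stop logic, with a hand-made list of chunks standing in for the socket so it runs without a network. The names `make_collector` and `@chunks` are made up for illustration. With LWP the same callback would be passed as the second argument to `$ua->request`, and per the LWP documentation a `die` inside it aborts the transfer and shows up in the response's `X-Died` header.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a callback that appends each chunk to
# $$buf_ref and dies once at least $limit bytes have been collected.
sub make_collector {
    my ($buf_ref, $limit) = @_;
    return sub {
        my ($chunk) = @_;
        $$buf_ref .= $chunk;
        die "limit reached\n" if length($$buf_ref) >= $limit;
    };
}

my $limit = 4096;
my $html  = '';
my $cb    = make_collector(\$html, $limit);

# Stand-in for the network: uneven chunk sizes, 6160 bytes in total.
my @chunks = ('a' x 160, 'b' x 3000, 'c' x 2000, 'd' x 1000);

my $calls = 0;
for my $chunk (@chunks) {
    $calls++;
    # LWP traps the die itself; here eval plays that role.
    last unless eval { $cb->($chunk); 1 };
}

# The last chunk usually overshoots, so trim back to exactly $limit bytes.
$html = substr($html, 0, $limit);

printf "%d callback calls, %d bytes kept\n", $calls, length $html;
```

The fourth chunk is never touched: the callback dies during the third call, once the running total passes the limit.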

    Hmm... I never noticed the $self->{ua} thing before. I'll have to take a closer look at that... or hang on, is that just part of a larger object?

    --
    g r i n d e r
      Silly grinder, length is for kids :D (request takes an additional arg, which is, bytesize)
      use HTTP::Request;
      use LWP::UserAgent;
      my $html = '';
      my $request = HTTP::Request->new(GET => "$url/$file");
      my $ua = LWP::UserAgent->new;
      my $response = $ua->request($request, \&cb, 4096);
      sub cb {
          print $_[0];
          die;
      }

       
      ___crazyinsomniac_______________________________________
      Disclaimer: Don't blame. It came from inside the void

      perl -e "$q=$_;map({chr unpack qq;H*;,$_}split(q;;,q*H*));print;$q/$q;"

        Yeah, but the person said that trying it that way didn't work as expected. I was giving her/him another way of doing it.

        Correct me if I'm wrong, but in your example if the remote site is (for instance) heavily loaded and sends you a chunk of (for example) 160 bytes, your callback will be called, die, and close the connection prematurely.

        later: I suppose what I am talking about is this in another form.

        --
        g r i n d e r
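To make that concern concrete, here is a small simulation with no network involved. The 160-byte first chunk is the hypothetical figure from the post above. It assumes, as the LWP documentation describes, that the third argument to `request` is only a size hint, so a callback may be handed fewer bytes than were asked for.

```perl
use strict;
use warnings;

# Callback in the style of the reply above: keep the first chunk, then stop.
my $got = '';
my $first_chunk_only = sub {
    $got .= $_[0];
    die "stop\n";
};

# Pretend a busy server delivers 160 bytes first, then the remaining 3936.
my @chunks = ('x' x 160, 'y' x 3936);

for my $chunk (@chunks) {
    # eval stands in for LWP, which traps the die and ends the request.
    last unless eval { $first_chunk_only->($chunk); 1 };
}

printf "wanted 4096 bytes, got %d\n", length $got;
```

A callback that accumulates and checks the running length does not depend on how the server happens to split the data.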
      The $self->{ua} is there because I hacked out _init() from the original .pm, and made it one of my methods so I could watch a bit closer. I like your call back idea, though. It means I don't have to understand why a server I thought is 1.1 compliant isn't. I probably need less than 4k. It was an arbitrary length I chose to transfer; tar will pull out what I need and complain to dev null about the rest. So, this response taught me something useful, and your subsequent posting looks amazing. Thanks.
Re: How do I retrieve a piece of a file?
by ask (Pilgrim) on Nov 24, 2001 at 05:10 UTC
    If the web server supports ranges, you should be able to add an HTTP header like
    Range: bytes=0-4095
    to make it return just the first 4KB.
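
A sketch of setting that header with HTTP::Request, offered as an illustration and not as the poster's own code. Byte offsets in a Range header start at 0, so the first 4096 bytes are 0 through 4095. The URL is a placeholder, and nothing is sent over the network here; the request is only built and printed.

```perl
use strict;
use warnings;
use HTTP::Request;

# Placeholder URL, for illustration only.
my $url = 'http://www.example.com/archive.tar';

my $request = HTTP::Request->new(GET => $url);

# Ask for bytes 0 through 4095, i.e. the first 4096 bytes.
$request->header(Range => 'bytes=0-4095');

print $request->as_string;

# When sent with an LWP::UserAgent, a server that honours the header
# should answer "206 Partial Content"; a plain 200 means it ignored
# the range and is sending the whole file.
```

Checking `$response->code` for 206 after the request is a cheap way to find out whether the server actually supports ranges.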

     - ask

    -- 
    ask bjoern hansen, http://ask.netcetera.dk/   !try; do();
    
Re: How do I retrieve a piece of a file?
by tadman (Prior) on Nov 24, 2001 at 03:42 UTC
    If the request is not going as planned, perhaps due to missing HTTP/1.1 support on the server side, there are a few ways to work around it.

    First, if all you want is the first 4096 bytes, but don't mind downloading the extra, you can use substr to extract the required data.

    Otherwise, if you are using only the first 4K of a really large file, if you are able to, deploy a quick CGI on the server-side which will send you the first 4K. Make sure to keep it either simple (i.e. only sends 4K of a particular file), or secure (i.e. HTTP authentication, or such).
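
A rough sketch of what such a CGI could look like, following the "keep it simple" option: it serves the first 4K of one hard-coded file and takes no input from the client. The path and the helper name `first_bytes` are hypothetical. The response is only emitted when `GATEWAY_INTERFACE` is set, which CGI servers normally do.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return up to $n bytes from the start of $path.
sub first_bytes {
    my ($path, $n) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my $buf = '';
    while (length($buf) < $n) {
        # Append at the current end of $buf; read may return short counts.
        my $read = read($fh, $buf, $n - length($buf), length($buf));
        die "Read error on $path: $!" unless defined $read;
        last if $read == 0;    # end of file
    }
    close $fh;
    return $buf;
}

# Hypothetical fixed path: the script never takes a filename from the client.
my $file = '/var/www/files/archive.tar';

# Only emit a response when running under a CGI server.
if ($ENV{GATEWAY_INTERFACE}) {
    my $data = first_bytes($file, 4096);
    binmode STDOUT;
    print "Content-Type: application/octet-stream\r\n";
    print "Content-Length: ", length($data), "\r\n\r\n";
    print $data;
}
```

Hard-coding the path is what keeps this safe without authentication; accepting a filename parameter would need careful validation.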

    A more brute-force approach would be to just terminate the connection after you get what you need.
Re: How do I retrieve a piece of a file?
by belg4mit (Prior) on Nov 24, 2001 at 03:06 UTC
    This isn't really a Perl question.
    However, you are using an HTTP/1.1 feature. If the server you are connecting to is not HTTP/1.1 compliant it will not work as you wish.

    --
    perl -p -e "s/(?:\w);([st])/'\$1/mg"

      I assumed at least one of the 2 servers I tried was 1.1 compliant - now it looks like I've got to check, and that's a good thing. Thanks.