LWP::Simple Getstore memory

by AI Cowboy (Beadle)
on Jun 05, 2013 at 02:50 UTC ( [id://1037096]=perlquestion )

AI Cowboy has asked for the wisdom of the Perl Monks concerning the following question:

Greetings fellow monks...

My problem is fairly simple.

Is there a way to stop LWP::Simple's getstore() from loading files into memory, so that you can download larger files without worrying about crashing the computer and causing the universe to explode?

I looked at lwp-download, but it doesn't have any documentation that I can see, and I have no idea how to use it. The only other fix I can think of is reading the file line by line and printing each line to a new file on the local machine, but that would cause issues with encoding, wouldn't it?

As always, all help is appreciated!

EDIT: I have solved the problem with content_file, thanks to everyone for pointing out things I'd missed :)

Replies are listed 'Best First'.
Re: LWP::Simple Getstore memory
by aitap (Curate) on Jun 05, 2013 at 06:30 UTC
    Looking at LWP::Simple sources, it's already done:
    sub getstore ($$) {
        my($url, $file) = @_;
        my $request = HTTP::Request->new(GET => $url);
        my $response = $ua->request($request, $file);
        $response->code;
    }
    And $file ends up in LWP::Protocol collect() method:
    elsif (!ref($arg) && length($arg)) {
        open(my $fh, ">", $arg) or die "Can't write to '$arg': $!";
        binmode($fh);
        push(@{$response->{handlers}{response_data}}, {
            callback => sub {
                print $fh $_[3] or die "Can't write to '$arg': $!";
                1;
            },
        });
    Files are not loaded into memory. You may want to tweak the :read_size_hint argument of the get() method of LWP::UserAgent if it still overflows.
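
To make the point above concrete, here is a minimal sketch of both routes: getstore() from LWP::Simple, and get() from LWP::UserAgent with :content_file and :read_size_hint. The file:// URL, the temporary directory, the file names, and the 8192-byte hint are all assumptions chosen so the example runs offline; in real use $url would be an http:// or https:// address.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use LWP::Simple qw(getstore is_success);
use LWP::UserAgent;
use URI::file;

# Stand-in "remote" file so the sketch runs offline (hypothetical data).
my $dir    = tempdir(CLEANUP => 1);
my $source = "$dir/source.bin";
open(my $out, '>', $source) or die "Can't write to '$source': $!";
binmode($out);
print {$out} 'x' x 200_000;
close($out) or die "Can't close '$source': $!";

my $url = URI::file->new($source)->as_string;

# getstore() returns the status code; the body is written to the file.
my $simple_copy = "$dir/getstore.bin";
my $code        = getstore($url, $simple_copy);
die "getstore failed with status $code" unless is_success($code);

# The same transfer through LWP::UserAgent, asking for reads of about 8 KiB.
my $ua_copy  = "$dir/useragent.bin";
my $ua       = LWP::UserAgent->new;
my $response = $ua->get(
    $url,
    ':content_file'   => $ua_copy,
    ':read_size_hint' => 8192,
);
die 'get failed: ' . $response->status_line unless $response->is_success;

printf "%s: %d bytes\n", $_, (-s $_) for $simple_copy, $ua_copy;
```

In both cases the response body is written to disk as it arrives, so the response object itself carries no content.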
Re: LWP::Simple Getstore memory
by BrowserUk (Patriarch) on Jun 05, 2013 at 03:24 UTC
    I looked at LWP-Download but it doesn't have any documentation I can see,

    Weird. The page you linked *is* the documentation.

    It is a command-line app that is installed with LWP (libwww-perl), but you could invoke it via system.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      D'oh - I didn't understand that. I thought it was a module under LWP, to use in Perl. This explains my confusion. Many thanks!
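
As a rough illustration of the "invoke it via system" suggestion, the sketch below builds an lwp-download command line and runs it with the list form of system(), which avoids the shell. The helper names lwp_download_command and download_via_lwp_download are hypothetical, and the sketch assumes lwp-download is on the PATH.

```perl
use strict;
use warnings;

# Hypothetical helper: build the argument list for lwp-download.
# Usage of the real tool is: lwp-download <url> [<local path>]
sub lwp_download_command {
    my ($url, $target) = @_;
    my @cmd = ('lwp-download', $url);
    push @cmd, $target if defined $target;
    return @cmd;
}

# Hypothetical helper: run lwp-download and die on a non-zero exit status.
sub download_via_lwp_download {
    my @cmd = lwp_download_command(@_);
    system(@cmd) == 0
        or die "'@cmd' failed (status $?)";
    return 1;
}

# Only runs when the script is called as: script.pl <url> <local path>
download_via_lwp_download(@ARGV) if @ARGV == 2;
```

Because the arguments are passed as a list, a URL containing shell metacharacters such as & or ? is handed to lwp-download unchanged.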
Re: LWP::Simple Getstore memory
by vsespb (Chaplain) on Jun 05, 2013 at 06:21 UTC
    LWP::UserAgent can download files of any size into memory. However, I did not understand your problem. Could you please provide PoC code for:
    crashing the computer and causing the universe to explode
    line on a new file on the local machine, but that would cause issues with encoding
      Or maybe you want to download files directly to disk, without huge memory buffer.

      The :content_file or :content_cb options of LWP::UserAgent's get() will help then.
        Yes, I want to download files directly to disk, without a huge memory buffer. :content_file looks like it practically replaces getstore: you simply set :content_file to the filename and the file is stored that way, without ever being read into memory. Am I missing something here, or is it essentially a getstore replacement?
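
For the :content_cb option mentioned in this sub-thread, here is a hedged sketch that writes each chunk to a file handle and keeps a running byte count. The file:// URL, the temporary files, and the 16384-byte read hint are assumptions made so the example runs offline. Where only a plain copy to disk is needed, :content_file with a filename covers the same ground with less code.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use LWP::UserAgent;
use URI::file;

# Stand-in "remote" file so the sketch runs offline (hypothetical data).
my $dir    = tempdir(CLEANUP => 1);
my $source = "$dir/source.bin";
open(my $out, '>', $source) or die "Can't write to '$source': $!";
binmode($out);
print {$out} 'y' x 300_000;
close($out) or die "Can't close '$source': $!";

my $url    = URI::file->new($source)->as_string;
my $target = "$dir/streamed.bin";

open(my $fh, '>', $target) or die "Can't write to '$target': $!";
binmode($fh);

my $received = 0;
my $ua       = LWP::UserAgent->new;
my $response = $ua->get(
    $url,
    ':read_size_hint' => 16_384,
    # Called once per chunk with the data, the response, and the protocol object.
    ':content_cb' => sub {
        my ($chunk, $res, $protocol) = @_;
        print {$fh} $chunk or die "Can't write to '$target': $!";
        $received += length $chunk;
    },
);
close($fh) or die "Can't close '$target': $!";

die 'get failed: ' . $response->status_line unless $response->is_success;

# LWP traps a die() raised inside the callback and reports it in X-Died.
my $died = $response->header('X-Died');
die "callback died: $died" if defined $died;

print "received $received bytes\n";
```

The callback is the place to add a progress display or a size limit, since it sees every chunk before it reaches the disk.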
Re: LWP::Simple Getstore memory
by Anonymous Monk on Jun 05, 2013 at 03:35 UTC
      I'm not sure what that thing is supposed to do, but it looks like it's designed to try and continue a download that is interrupted - what I am looking for, is something that does not store files in memory before writing them to the hard-drive, which is what getstore does, and is not good when you're not sure what files your tool is going to be downloading. If they're 10 GB, then you could crash the client computer, and that's no good.

        Um, getstore already does that?

        This part talks about what you're looking for -- I didn't confirm that getstore really does that, did you confirm it?

        I'm not sure what that thing is supposed to do, but it looks like it's designed to try and continue a download that is interrupted -

        Yes, it's like that; it's a demo of that.
