Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

I have a script that's using LWP to do very simple FTP gets and puts. The code for the FTP GET looks something like this:

    #an FTP get
    my $ua = LWP::UserAgent->new();
    $ua->agent("$0/0.1 " . $ua->agent);
    my $req = HTTP::Request->new(GET => $url);
    my $result = $ua->request($req);

And the code for an FTP PUT looks something like this:

    #an FTP put
    #open the file and read the contents
    my $content;
    if (open DATA_READER, "$data_file") {
        $content = join("", <DATA_READER>);
        close DATA_READER;
    }
    else {
        warn "UNABLE TO READ $data_file, upload will be empty! $!\n";
    }
    my $ua = LWP::UserAgent->new();
    $ua->agent("$0/0.1 " . $ua->agent);
    my $req = HTTP::Request->new('PUT', $url, undef, $content);
    my $result = $ua->request($req);

The thing is, I really don't care about the file data itself--in fact, when I do the FTP get, I never actually write the file to disk; I'm just concerned with whether the file was transferred properly or not. Some of the files I have to fetch and put are rather large...around 100-250MB. With the above method, Perl is taking up a huge chunk of memory to buffer the files. Can anyone suggest a way to reduce the memory footprint of this code? For instance, is there a way to feed the file to the FTP put without buffering it into memory first? Likewise, is there a way to flush the buffer in the FTP get as data is coming in? It would be best if I can do this with LWP instead of having to resort to using other modules.

Replies are listed 'Best First'.
Re: Using Less Memory for LWP/FTP
by Corion (Patriarch) on Dec 02, 2003 at 14:21 UTC

    The LWP::UserAgent documentation tells me of two special parameters to get():

    :content_file => $filename
    :content_cb   => \&callback
    The :content_file parameter is for directly saving the content of the response to a file, the :content_cb parameter is for passing the contents directly to a supplied callback.

    You should also take a look at the mirror() and getstore() methods, which might be suitable for your tasks as well.
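    A minimal sketch of the callback approach (the URL, host, and filename are placeholders): each chunk is handed to the callback and then discarded, so only the headers and status stay in memory.

    ```perl
    use strict;
    use warnings;
    use LWP::UserAgent;

    my $url = 'ftp://ftp.example.com/pub/bigfile.bin';   # placeholder
    my $ua  = LWP::UserAgent->new();
    $ua->agent("$0/0.1 " . $ua->agent);

    # The body is streamed to the callback chunk by chunk and never
    # accumulated, so a 100-250MB file stays out of memory.
    my $response = $ua->get($url, ':content_cb' => sub {
        my ($chunk, $resp) = @_;
        # inspect or checksum $chunk here if you like; just don't store it
    });

    if ($response->is_success) {
        print "transfer OK\n";
    }
    else {
        print "failed: ", $response->status_line, "\n";
    }
    ```

    The `:content_file => $filename` form works the same way, except LWP writes each chunk straight to disk instead of handing it to your code.
    
    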

Re: Using Less Memory for LWP/FTP
by iburrell (Chaplain) on Dec 02, 2003 at 21:27 UTC

    Look at mirror() and the second argument to request() in LWP::UserAgent. The second argument controls where the response goes: if it is a scalar, it is used as a filename; if it is a code reference, the callback is called with each block of data as it arrives. For your application, your subroutine can validate and then throw away the data. Also, look at LWP::Protocol::collect.

    my $ua = LWP::UserAgent->new();
    my $request = HTTP::Request->new('PUT', $url, undef, $content);
    my $response = $ua->request($request, \&check_response);
    With Net::FTP, you will have to read from the data socket yourself. The retr command will start the download and return the socket for the data connection. Your code can then read the data and throw it away.
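    A sketch of that with Net::FTP (host, credentials, and filename are placeholders): retr() returns a data connection object whose read() fills a buffer that is simply overwritten on each pass, so memory use stays constant.

    ```perl
    use strict;
    use warnings;
    use Net::FTP;

    my $ftp = Net::FTP->new('ftp.example.com')        # placeholder host
        or die "connect failed: $@";
    $ftp->login('user', 'password')
        or die "login failed: ", $ftp->message;
    $ftp->binary;

    my $data = $ftp->retr('bigfile.bin')
        or die "retr failed: ", $ftp->message;

    my ($buf, $total) = ('', 0);
    while (my $n = $data->read($buf, 64 * 1024)) {    # reuse one 64KB buffer
        $total += $n;                                 # count bytes, keep nothing
    }
    $data->close;
    $ftp->quit;
    print "received $total bytes\n";
    ```
    
    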

    For uploads, you will have to do things differently. But both LWP and Net::FTP will read from local files for uploads and don't need to hold the entire file in memory.
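    For the upload side, a sketch with Net::FTP (host, credentials, and paths are placeholders): put() takes a local filename, or an open filehandle, and streams the file to the server in blocks rather than slurping it first.

    ```perl
    use strict;
    use warnings;
    use Net::FTP;

    my $ftp = Net::FTP->new('ftp.example.com')        # placeholder host
        or die "connect failed: $@";
    $ftp->login('user', 'password')
        or die "login failed: ", $ftp->message;
    $ftp->binary;

    # put() reads the local file block by block itself, so the
    # whole 100-250MB file is never held in memory at once.
    $ftp->put('/path/to/data_file', 'remote_name')
        or die "put failed: ", $ftp->message;
    $ftp->quit;
    ```
    
    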

      >But both LWP and Net::FTP will read from local
      >files for uploads and don't need to hold the
      >entire file in memory.

      How is this done? Do I need to pass a filehandle or a filename into the request or something?
Re: Using Less Memory for LWP/FTP
by Art_XIV (Hermit) on Dec 02, 2003 at 14:14 UTC

    I don't have any ideas about how to reduce your buffering problems with LWP.

    I'm curious about why you didn't/can't use Net::FTP, which is all about file-slinging, as opposed to LWP, which is mostly about content.

    Hanlon's Razor - "Never attribute to malice that which can be adequately explained by stupidity"