PerlMonks  

Re: Re: Unable to complete download with Net::FTP

by Itatsumaki (Friar)
on Nov 29, 2003 at 06:45 UTC ( [id://310838]=note )


in reply to Re: Unable to complete download with Net::FTP
in thread Unable to complete download with Net::FTP

Given that I hadn't noticed the getstore function, I found the FTP-based implementation above to be much superior. Three reasons why:

  1. It's shorter code: fewer lines and characters
  2. It's clearer code: it avoids the intermediation of a variable into the download process
  3. It's faster code: I noticed this empirically, and I imagine the difference there is from saving everything into one 30MB variable.
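
The memory argument in reason 3 can be sketched without touching the network. The snippet below is only an illustration in core Perl: `copy_slurp` and `copy_chunked` are hypothetical helper names, and local files stand in for the remote download. The first holds the whole payload in one scalar, as the `LWP::Simple::get()` approach does; the second never holds more than one buffer at a time, which is roughly how a download written straight to disk behaves.

```perl
use strict;
use warnings;

# Hypothetical helper: read the whole source into one scalar, then write it.
# Peak memory grows with the size of the file.
sub copy_slurp {
    my ($src, $dst) = @_;
    open(my $in, '<', $src) or die "Cannot read $src: $!";
    binmode $in;
    my $data = do { local $/; <$in> };    # entire file held in memory
    close($in);
    open(my $out, '>', $dst) or die "Cannot write $dst: $!";
    binmode $out;
    print {$out} $data;
    close($out) or die "Cannot close $dst: $!";
    return length $data;
}

# Hypothetical helper: copy in fixed-size chunks.
# Peak memory stays near $chunk_size regardless of the file size.
sub copy_chunked {
    my ($src, $dst, $chunk_size) = @_;
    $chunk_size ||= 64 * 1024;
    open(my $in, '<', $src) or die "Cannot read $src: $!";
    binmode $in;
    open(my $out, '>', $dst) or die "Cannot write $dst: $!";
    binmode $out;
    my $total = 0;
    while (1) {
        my $n = read($in, my $buf, $chunk_size);
        die "Read error on $src: $!" unless defined $n;
        last if $n == 0;                  # end of file
        print {$out} $buf;
        $total += $n;
    }
    close($in);
    close($out) or die "Cannot close $dst: $!";
    return $total;                        # bytes copied
}
```

Both helpers return the number of bytes copied, so they can be swapped for one another when comparing memory use on a large local file.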

I guess you think the LWP version is clearer to read? After I get some sleep I'll benchmark the three approaches (Net::FTP, LWP::Simple::get(), and LWP::Simple::getstore()) and see what shakes out there.

-Tats

Replies are listed 'Best First'.
Re3: Unable to complete download with Net::FTP
by davis (Vicar) on Nov 29, 2003 at 13:04 UTC

    After I get some sleep I'll benchmark the three approaches
    You could certainly do that, but I believe you would be better off continuing with the method you find easiest, clearest, and most suitable. The difference in execution time between the various methods is likely (almost guaranteed) to be negligible, whereas using a method/module you find unintuitive will slow you down.
    Your time is worth more than a few seconds of processor execution time.


    davis
    It's not easy to juggle a pregnant wife and a troubled child, but somehow I managed to fit in eight hours of TV a day.
    Update: Minor text edit; title change

      I definitely agree; the benchmark is more a question of interest. At the same time, some of the files I download this way get up into the GB range. Downloading four or five of those in a night can eat up a lot of resources, so if there are easier or better ways to do it, they are worth trying out in this case.

Re: Re: Re: Unable to complete download with Net::FTP
by Itatsumaki (Friar) on Dec 01, 2003 at 07:35 UTC

    Here's the benchmark. I'd love some help interpreting it, because I don't know what to make of this. Judging by eye, the LWP get() version used the most memory, but I can't grok the huge difference in wall-clock time. Incidentally, to avoid spamming my favourite genomic-annotation provider I tested a much smaller file (about 10k). I don't think I could really run a test with more than 10 iterations on any of the bigger files, so if FTP has a long connect lag at the front, a larger file might make it more competitive.

    The code:

    use strict;
    use Benchmark;
    use Net::FTP;
    use LWP::Simple;

    # Fetch the whole file into a variable with get(), then write it out.
    sub lwp_simple {
        my $data = get('ftp://ftp.ncbi.nih.gov/refseq/LocusLink/LL.out_xl.gz');
        my $outfile = '>GO_TERMS.CSV';    # unused
        if (!$data) { }                   # a failed download is not handled
        open(OUT, '>LL_tmpl.gz');
        binmode OUT;
        print OUT $data;
        close(OUT);
        sleep 1;
    }

    # Fetch the file with Net::FTP, which writes it straight to disk.
    sub net_ftp {
        my $ftp;
        if (!($ftp = Net::FTP->new('ftp.ncbi.nih.gov', Debug=>0))) {
            print "Couldn't log-in";
            return;
        };
        $ftp->login('anonymous', 'anon@anon.com');
        $ftp->cwd('/refseq/LocusLink/');
        $ftp->type('binary');
        $ftp->get('LL.out_xl.gz');
        $ftp->quit();
        sleep 1;
    }

    # Fetch the file with getstore(), which also writes straight to disk.
    sub lwp_getstore {
        my $url = 'ftp://ftp.ncbi.nih.gov/refseq/LocusLink/LL.out_xl.gz';
        my $file = 'LL.out_xl.gz';
        getstore($url, $file);
        sleep 1;
    }

    timethese(100, {
        'LWP'       => \&lwp_simple,
        'FTP'       => \&net_ftp,
        'LWP-Store' => \&lwp_getstore
    } );

    The results:

    Benchmark: timing 100 iterations of FTP, LWP, LWP-Store...
          FTP: 4011 wallclock secs ( 2.31 usr + 2.68 sys = 5.00 CPU) @ 20.01/s (n=100)
          LWP: 933 wallclock secs ( 4.05 usr + 4.87 sys = 8.92 CPU) @ 11.21/s (n=100)
    LWP-Store: 340 wallclock secs ( 4.11 usr + 3.70 sys = 7.81 CPU) @ 12.80/s (n=100)
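
One possible way to read these numbers, offered as a sketch and not a definitive analysis: the `@ .../s` rate printed on each line matches iterations divided by CPU seconds (100 / 5.00 is about 20/s for FTP), so it says little about a network-bound task. Wall-clock time per iteration is the more telling figure, and each sub also spends one second in `sleep 1`. The script below only redoes that arithmetic on the figures reported above; the `%results` layout and the `summarize` helper are hypothetical names.

```perl
use strict;
use warnings;

# Figures copied from the benchmark output above (100 iterations each).
my %results = (
    'FTP'       => { wall => 4011, cpu => 5.00 },
    'LWP'       => { wall => 933,  cpu => 8.92 },
    'LWP-Store' => { wall => 340,  cpu => 7.81 },
);

# Hypothetical helper: derive per-iteration figures for one result row.
sub summarize {
    my ($row, $iterations, $sleep_per_iter) = @_;
    my $wall_per_iter = $row->{wall} / $iterations;
    return {
        wall_per_iter     => $wall_per_iter,                    # seconds
        transfer_per_iter => $wall_per_iter - $sleep_per_iter,  # minus the sleep
        cpu_share         => $row->{cpu} / $row->{wall},        # fraction of wall time
        cpu_rate          => $iterations / $row->{cpu},         # iterations per CPU second
    };
}

# Print the rows from fastest to slowest by wall-clock time.
for my $name (sort { $results{$a}{wall} <=> $results{$b}{wall} } keys %results) {
    my $s = summarize($results{$name}, 100, 1);
    printf "%-10s %6.2f s/iter wall, %6.2f s/iter excluding sleep, CPU share %.2f%%\n",
        $name, $s->{wall_per_iter}, $s->{transfer_per_iter}, 100 * $s->{cpu_share};
}
```

On these figures, wall-clock time per iteration works out to about 40.1 s for Net::FTP, 9.3 s for get() and 3.4 s for getstore(), while CPU time is under 3% of wall-clock time in every case. Almost all of the elapsed time is spent waiting, presumably on the network, which the CPU-based rate cannot show.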
    -Tats
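
To check the guess about a long connect lag at the front of the FTP run, the setup and transfer phases could be timed separately. This is a minimal sketch, assuming the core Time::HiRes module; `time_phases` is a hypothetical helper, and the two phases shown are stand-in `sleep` calls, not real Net::FTP calls. In real use the first coderef would wrap `Net::FTP->new`, `login` and `cwd`, and the second would wrap `get`.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Hypothetical helper: run named phases in order and record the wall-clock
# seconds each one takes. Arguments are name => coderef pairs.
sub time_phases {
    my @phases = @_;
    my @timings;
    while (my ($name, $code) = splice(@phases, 0, 2)) {
        my $start = time();
        $code->();
        push @timings, [ $name, time() - $start ];
    }
    return @timings;    # list of [name, seconds] pairs, in call order
}

# Stand-in phases (assumptions): replace the sleeps with the real
# connection setup and the real download.
my @timings = time_phases(
    connect  => sub { sleep(0.05) },
    transfer => sub { sleep(0.02) },
);
printf "%-8s %.3f s\n", @$_ for @timings;
```

If the connect phase dominates for a small file, that fixed cost would matter much less for a multi-GB transfer, which is the case the larger files would test.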
