Murcia has asked for the wisdom of the Perl Monks concerning the following question:

Hi Confreres, I want your expertise on "how to upload many files from many dirs as quickly as possible".

Problem:

I want to download thousands of files from a server. These files are distributed across several dirs.

Right now I do it with a system call to wget, uploading all files in each dir one after another:

my @dir = (lmo lin lmf bsu ssu sst ); # and more dir names
foreach my $dir (@dir) {
    my $link = "ftp://ftp.<Path_to_dir>.jp/$dir/*";
    system("wget -nH -nd --timestamping $link");
}
That takes time!

Should I fork? Does this make sense? And if so, how? Internet bandwidth is not a problem!

Thanks Murcia

Replies are listed 'Best First'.
Re: file upload
by halley (Prior) on Sep 15, 2005 at 14:32 UTC
    Your code won't compile as typed: the bareword list needs qw// or quoted strings.

    You could fork or provide system() arguments which will put each submitted command into the background, allowing the next iteration to begin immediately. However, this will mean that you're creating many concurrent login sessions to the same FTP server, which may fail if the FTP server has policies against such usage. You may also want to limit the number of concurrent sessions for other reasons, even if "bandwidth isn't a problem."
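    For example, a minimal sketch of the backgrounding variant (this assumes a POSIX shell, since the trailing & is what detaches each wget, and it puts no cap on concurrent sessions):

    foreach my $dir (@dir) {
        my $link = "ftp://ftp.<Path_to_dir>.jp/$dir/*";
        # the trailing '&' tells the shell to run wget in the
        # background, so the loop starts the next transfer immediately
        system("wget -nH -nd --timestamping $link &");
    }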

    --
    [ e d @ h a l l e y . c c ]

fork me? fork queue!
by LanceDeeply (Chaplain) on Sep 15, 2005 at 17:03 UTC
    my %children;
    my $max_children = 3;
    my @todo_list;

    foreach my $dir (qw/lmo lin lmf bsu ssu sst/) {
        fork_download($dir);
    }
    complete_downloads();

    sub fork_download {
        my $dir = shift;
        if ( ( scalar keys %children ) < $max_children ) {
            my $pid = fork;
            if ( $pid ) {
                # this is the parent process
                $children{$pid} = $dir;
            }
            else {
                if ( not defined $pid ) {
                    die "failed to fork!\n";
                }
                # this is the child process
                my $link = "ftp://ftp.<Path_to_dir>.jp/$dir/*";
                system("wget -nH -nd --timestamping $link");
                # exit child process
                exit 0;
            }
        }
        else {
            # too much child labor! queue for later
            push @todo_list, $dir;
        }
    }

    sub complete_downloads {
        while (%children) {
            my $childpid = wait();
            print "child [$childpid][" . $children{$childpid} . "] is done\n";
            delete $children{$childpid};
            if ( @todo_list ) {
                my $nextDir = shift @todo_list;
                fork_download($nextDir);
            }
        }
    }
    -HTH

    Update: forgot to exit child process!
Re: file upload
by newroz (Monk) on Sep 15, 2005 at 15:03 UTC
    It is not clear whether you're uploading or downloading.
    In either case, look at Net::FTP. It will save your program from system calls,
    and you'll be able to use one session for multiple files.
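    A minimal sketch of the Net::FTP route (the hostname and login here are placeholders, not the OP's real server):

    use Net::FTP;

    # one control connection, reused for every dir and file
    my $ftp = Net::FTP->new('ftp.example.jp', Passive => 1)
        or die "cannot connect: $@";
    $ftp->login('anonymous', 'you@example.com')
        or die "cannot login: ", $ftp->message;
    $ftp->binary;

    for my $dir (qw/lmo lin lmf bsu ssu sst/) {
        unless ( $ftp->cwd("/$dir") ) {
            warn "cannot cwd to /$dir: ", $ftp->message;
            next;
        }
        for my $file ( $ftp->ls ) {
            $ftp->get($file)
                or warn "get $file failed: ", $ftp->message;
        }
    }
    $ftp->quit;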