camerlengo has asked for the wisdom of the Perl Monks concerning the following question:

I'm searching for an ideal method for transferring large amounts of data from one unix server to another. I'm thinking of an implementation that uses a "server" to feed groups of files to several "clients". Each client would transfer its file(s) (ftp, scp, whatever) and then receive another chunk of files to process, until all files have been transferred. I've googled and searched through PM for clues as to how best to implement this in Perl. It seems that the best options are threads and POE. Has anyone blazed this trail already? Any implementation recommendations? Any file transfer protocol preferences?
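
To make that concrete, here is a rough, untested sketch of the pattern I have in mind, using core threads and Thread::Queue (it needs a threads-enabled perl; the host, paths, and worker count are placeholders, scp stands in for whatever protocol wins, and files are dispatched one at a time rather than in chunks):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use threads;
    use Thread::Queue;

    my $remote  = 'user@desthost:/data/incoming/';    # placeholder destination
    my $workers = 4;

    # The "server" side: load every pending file into a shared queue,
    # then add one stop token per worker.
    my $queue = Thread::Queue->new();
    $queue->enqueue($_) for glob '/data/outgoing/*';  # placeholder source dir
    $queue->enqueue(undef) for 1 .. $workers;

    # The "clients": each worker pulls a file, transfers it, and asks
    # for more until it hits a stop token.
    my @threads = map {
        threads->create(sub {
            while (defined(my $file = $queue->dequeue())) {
                system('scp', '-q', $file, $remote) == 0
                    or warn "transfer of $file failed: $?\n";
            }
        });
    } 1 .. $workers;

    $_->join() for @threads;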

Re: Multi Stream File Transfer
by dragonchild (Archbishop) on Jul 14, 2004 at 17:56 UTC
    You're looking for a Perl solution to a non-Perl problem. Take a look at BitTorrent; it's a perfect fit for this problem.

    ------
    We are the carpenters and bricklayers of the Information Age.

    Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose

    I shouldn't have to say this, but any code, unless otherwise stated, is untested

Re: Multi Stream File Transfer
by waswas-fng (Curate) on Jul 14, 2004 at 18:49 UTC
    If you are just talking about pushing a bunch of files from one server to many clients, take a look at rsync. It will send the files to the remote servers, and as files change on the master server it will resend only the changed files (better yet, only the portions of each file that differ). This saves a lot of bandwidth.
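
    To sketch the idea (the host names and paths here are invented, and the flags are just the usual archive/compress/mirror set):

        #!/usr/bin/perl
        use strict;
        use warnings;

        my @clients = qw( web1 web2 web3 );   # invented host list
        my $src     = '/data/outgoing/';      # trailing slash: copy contents
        my $dest    = '/data/incoming/';

        for my $host (@clients) {
            # -a preserves perms/times, -z compresses, --delete mirrors deletions
            system('rsync', '-az', '--delete', $src, "$host:$dest") == 0
                or warn "rsync to $host failed: $?\n";
        }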


    -Waswas

Re: Multi Stream File Transfer
by valdez (Monsignor) on Jul 14, 2004 at 21:57 UTC

    I would use a... Flamethrower :) From Freshmeat:

    Flamethrower is a multicast file distribution system. It was originally created to add multicast install capabilities to SystemImager, but is designed as a stand-alone package. It works with entire directory hierarchies, rather than single files. It uses a server configuration file, which takes module entries similar to rsyncd.conf. It is an on-demand system; multicast of a module is initiated when a client connects, but the server waits a predetermined period for other clients to connect before beginning.
    Oh, and it's part of the Perl Foundry!
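
    Since Flamethrower's config is only described as "similar to rsyncd.conf", for comparison a plain rsyncd.conf module entry looks like this (module name and path invented):

        [images]
            path = /srv/images
            comment = OS install images
            read only = yes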

    Ciao, Valerio

Re: Multi Stream File Transfer
by hardburn (Abbot) on Jul 14, 2004 at 18:02 UTC

    Doing scalable, practical P2P file transfer is very non-trivial, even from a purely technical point of view (not touching the ugly politics of it all). The Gnutella people screwed it up for years. Further, most of the scalable P2P apps out there have been re-implementing each other's ideas and calling them different things: look particularly at what Freenet calls a "CHK"; similar ideas show up in almost every P2P app around, except the simple-stupid ones. So don't add to the mess. Hard as the problem is, many people have already created good solutions, and you'll do well to find them.
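
    (A "CHK" is a content hash key: the lookup key is derived by hashing the data itself, so identical content gets the same key wherever it lives. A toy illustration with the core Digest::MD5 module; the file path is invented:)

        use strict;
        use warnings;
        use Digest::MD5 qw(md5_hex);

        # The key depends only on the bytes, not the file's name or location.
        sub content_key {
            my ($path) = @_;
            open my $fh, '<', $path or die "can't read $path: $!";
            binmode $fh;
            return md5_hex(do { local $/; <$fh> });
        }

        print content_key('/data/outgoing/somefile'), "\n";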

    ----
    send money to your kernel via the boot loader.. This and more wisdom available from Markov Hardburn.