mr. jaggers has asked for the wisdom of the Perl Monks concerning the following question:

I'm using Net::SFTP (which relies on the all-Perl Net::SSH::Perl module) to automate retrieval of backups, currently on a Slackware 9.0 machine w/an AMD K6-III running at 530MHz. New/different hardware is not going to happen (except that I could possibly procure a dual-CPU Pentium III 450 machine, w/only one CPU :).

At any rate, there ends up being about 4.5GB of backups, when all is said and done. Currently, I can run sftp from the command line (using the OpenSSH implementation of the client, written in C) to download all the backups very quickly. However, my automation script loads the processor (but not the memory) too heavily, I surmise because it relies on the pure-Perl implementation of SSH, and takes no less than 18 hours to download everything. Worse, when the download is combined with the load of tar and bzip2 as the data goes to tape, the machine crawls to an unacceptable level of performance (I timed the tape job at a couple of hours, which jumps up to over 6 hours when tape & download jobs overlap). Badbadbad.

I really like the Net::SFTP API: it's clean, and saves me tons of code. I was able to build in a debug mode, keyed on a '-d' command line switch, and I really don't want to change anything, but I have to. Currently, I'm still using Net::SFTP to build the file hash from the info returned by its ls() function, which is also a highly useful and well-written feature. However, I'm forced to farm out the actual file transfers to a filehandle opened to a child process, feeding the SFTP commands to the local sftp client via stdin.
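
The stopgap looks roughly like this (a reconstruction for illustration, not the actual script; host and file names are made up):

    use strict;
    use warnings;

    # Hypothetical list; the real script builds this from Net::SFTP's ls().
    my @files = qw(monday.tar.bz2 tuesday.tar.bz2);

    # One-way pipe: feed 'get' commands to the local OpenSSH sftp client.
    open my $sftp, '|-', 'sftp', 'backup@backuphost'
        or die "can't spawn sftp: $!";
    print {$sftp} "get /backups/$_ /local/backups/$_\n" for @files;
    print {$sftp} "quit\n";
    close $sftp
        or warn "sftp exited with status ", $? >> 8, "\n";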

Besides not entirely working and breaking all of my debugging/diagnostics code, this implementation has other problems (not to mention that I don't like it). What would be ideal is a wrapper-based implementation of Net::SFTP, along the lines of Net::SSH.

Anyone seen/been-forced-to-build it? Any tips on how I could modify the current Net::SFTP to do so? Has anyone been able to use LWP for this? I guess I'd be willing to switch if it has SFTP functionality.

Any ideas/tips that could rescue my code from rapidly kludging itself into a mire of ugly hackiness?

Re: Perl and SFTP
by iburrell (Chaplain) on Apr 29, 2004 at 01:16 UTC
    A wrapper-based implementation of SFTP is unlikely. Net::SSH just calls the ssh command; there is no library interface. This would not work for SFTP, which needs an API for all the operations.
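
    (For comparison, Net::SSH's entire interface is a thin layer over the ssh binary; e.g., this just shells out to ssh:)

        use Net::SSH qw(ssh);

        # Roughly equivalent to running: ssh user@host "ls -l"
        ssh('user@host', 'ls -l');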

    First thing to check is that you are using a C library for the encryption. This is the big bottleneck with bulk transfers. For example, Crypt::Blowfish is C/XS while Crypt::Blowfish_PP is pure Perl. Crypt::DES and Crypt::IDEA are C/XS. Also, try turning off compression if you have a fast connection.
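
    A quick way to see which of those you actually have installed (just a throwaway check I'd run, nothing specific to Net::SSH::Perl): if Crypt::Blowfish loads at all, you have the XS build, since the pure-Perl variant lives under the separate Crypt::Blowfish_PP name.

        use strict;
        use warnings;

        for my $mod (qw(Crypt::Blowfish Crypt::Blowfish_PP Crypt::DES Crypt::IDEA)) {
            (my $file = $mod) =~ s{::}{/}g;
            $file .= '.pm';
            if (eval { require $file; 1 }) {
                print "$mod => $INC{$file}\n";    # installed; path shows which build
            }
            else {
                print "$mod => not installed\n";
            }
        }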

    Passing the files to an external sftp process may be faster, though. To keep the complexity down, consider doing it in two steps: first, use Net::SFTP to build the file list; then use sftp to transfer the files.
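
    Untested sketch of what I mean (host, user, and paths invented for illustration):

        use strict;
        use warnings;
        use Net::SFTP;

        my ($host, $user) = ('backuphost', 'backup');    # invented for the example

        # Step 1: build the file list with Net::SFTP's ls().
        my $sftp  = Net::SFTP->new($host, user => $user);
        my @files = grep { !/^\.\.?$/ }
                    map  { $_->{filename} } $sftp->ls('/backups');

        # Step 2: write a batch script and let the C sftp client move the bytes.
        open my $batch, '>', 'fetch.batch' or die "fetch.batch: $!";
        print {$batch} "get /backups/$_ /local/backups/$_\n" for @files;
        print {$batch} "quit\n";
        close $batch;

        system('sftp', '-b', 'fetch.batch', "$user\@$host") == 0
            or warn "sftp batch run exited with status ", $? >> 8, "\n";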

      This would not work for SFTP, which needs an API for all the operations.

      That bites the big llama. The SFTP batch mode (using the -b command line switch) will silently exit if an error occurs on any 'get' command in the batch script. If I don't use batch mode, then I have to do some bidirectional IPC (I must be able to read the process's output too, or my code won't know *what* the heck is going on). I'm sure that requirement is behind some of the problems I'm having with opening bidirectional communication with the sftp process. I've read chapter 6 of the camel book, and I'm trying to streamline the IPC (or rather, make it actually function correctly). It sucks. I can't even rely on process return values to know about premature exits. I'd have to put the whole thing in a big verification loop: build local and destination file lists, compare them, smartly modify the download list, lather, rinse, repeat.
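
      For the record, the two-way plumbing I keep fighting looks roughly like this (an illustrative sketch only; pipe buffering and sftp's tty detection make it fragile in practice):

          use strict;
          use warnings;
          use IPC::Open2;

          # open2 raises an exception on failure, so no "or die" here.
          my $pid = open2(my $from_sftp, my $to_sftp, 'sftp', 'backup@backuphost');

          print {$to_sftp} "get /backups/monday.tar.bz2 /local/backups/monday.tar.bz2\n";
          print {$to_sftp} "quit\n";
          close $to_sftp;

          while (my $line = <$from_sftp>) {
              print "sftp: $line";    # parse here for per-file success/failure
          }
          waitpid $pid, 0;
          my $status = $? >> 8;       # whatever exit status sftp deigns to set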

      I figured it would be similar to the ssh wrapper around a system binary. Even a Net::SFTP that used Net::SSH instead of Net::SSH::Perl would be just as good. I suppose I'm relegated to reading the code (it looks scary) to convince myself that I am, in fact, screwed.

      First thing to check is that you are using a C library for the encryption.

      Ok, I'll check it out; I was suspicious of something like that. However, I'm not sure how I can even ensure that: these are source builds of the Perl modules, and I'm not sure that I can manhandle Net::SFTP into using Crypt:: library functions that weren't passed to it from Net::SSH::Perl.
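
      Looking at the Net::SFTP docs, there is an ssh_args option that gets handed straight to the Net::SSH::Perl constructor, so maybe something like this would let me pick the cipher (untested; the SSH2 cipher name may need checking against the Net::SSH::Perl docs):

          use strict;
          use warnings;
          use Net::SFTP;

          my $sftp = Net::SFTP->new(
              'backuphost',                              # hypothetical host
              user     => 'backup',
              ssh_args => [ cipher => 'blowfish-cbc' ],  # forwarded to Net::SSH::Perl->new
          );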

      Compression is already off, as the files themselves are already compressed, and I didn't think I'd get much benefit from compressing packet headers, etc. If I could get even a 50% increase in performance, then that would guarantee that the program finishes during the available window, although it still feels morally wrong to allow it 8 hours to download 4 gigs over a 100BaseT connection (at wire speed, roughly 12MB/s, that much data should move in minutes, not hours).

      Passing the files to an external sftp process may be faster, though. To keep the complexity down, consider doing it in two steps: first, use Net::SFTP to build the file list; then use sftp to transfer the files.

      Yeah, that's what I was talking about earlier. I'm building the file hashes with the Perl module, but passing the commands out to the system sftp still ruins my debugging/verification. I guess I'm up for a rewrite, but it's a short program anyway. Soon to be a longer program.

      ... sigh ...

      Thanks for the help, iburrell!