in reply to Mixing sysread() with <FILEHANDLE>?

This sounds a lot like you're trying to re-invent FTP (SFTP to be more precise). If using FTP isn't workable for your scenario, you could take a hint from how it opens up a separate session to transmit files.

Passive mode FTP is a good model: one side (the server, in FTP's case) opens a listening socket on a random port, then tells the other side which port it is. The other side connects and sends one blob, so there is no risk of misinterpreting data.
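A rough sketch of that idea, assuming plain TCP on the loopback interface and running both sides in one process for brevity. The `DATA-PORT` message name is made up for illustration, and in a real exchange it would travel over the control connection.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Side A: listen on an OS-assigned port (LocalPort => 0) for the data connection.
my $listener = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1',
    LocalPort => 0,
    Listen    => 1,
    Proto     => 'tcp',
) or die "cannot listen: $@";
my $port = $listener->sockport;

# Announce the port (hypothetical message; would go over the control channel).
print "DATA-PORT $port\n";

# Side B: connect to the announced port, send exactly one blob, then close.
my $blob = "\x00\x01binary\npayload\xff";
my $out  = IO::Socket::INET->new(
    PeerAddr => '127.0.0.1',
    PeerPort => $port,
    Proto    => 'tcp',
) or die "cannot connect: $@";
binmode $out;
print {$out} $blob;
close $out;

# Side A: accept and read to EOF; everything received is the blob.
my $in = $listener->accept or die "accept failed: $!";
binmode $in;
my $received = do { local $/; <$in> };
close $in;

print $received eq $blob ? "blob intact\n" : "blob corrupted\n";
```

Because the connection carries one blob and nothing else, the reader needs no framing: end of file is end of blob.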


sas

Re^2: Mixing sysread() with <FILEHANDLE>?
by pc88mxer (Vicar) on May 26, 2008 at 23:33 UTC
    I would definitely not use the FTP model. It just brings in too many operational issues (firewalling, security, etc.). Moreover, you have to make a separate TCP connection for each 'blob', and that will easily eliminate any speed-up gained by not having to encode the data.

      Security and firewalls are a concern for all TCP/IP communication; FTP isn't secure, but that's not relevant here. Given the use of SSH, SFTP could be used, which should add no security concerns.

      As for using FTP as a model, the key point is having a different session for the binary data. As for performance, I don't think you can assume that a new session per blob is going to be a major factor. It depends on the average size of those blobs. Anyway, as the original poster noted, a single separate data session would likely serve his purpose.


      sas
Re^2: Mixing sysread() with <FILEHANDLE>?
by wanna_code_perl (Friar) on May 26, 2008 at 23:26 UTC

    Truly, I don't want to re-invent FTP. The key here is that the only channel I can rely on is SSH, and it is going to be way too much overhead to open up separate data channels for each binary object.

    Objects vary in size from a few bytes to about 1GB, and there could easily be thousands in a single session.

    I guess I could open up one extra data channel when the original connection is initiated, and then use syswrite()/sysread() on that, approximately like this:

    Control channel:

    C: STORE Name=<name> Content-Length=<length> MD5=<hash>
    S: OK, go ahead

    Then the client transmits the object in a series of syswrite() calls to the separate SSH channel. The separate process on the server would do sysread() only.
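A minimal sketch of the client side of that exchange, assuming the object fits in memory and its name contains no whitespace. `store_header` and `write_all` are hypothetical helper names, and a pipe stands in for the data channel.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Build the control-channel line for one object (hypothetical helper).
sub store_header {
    my ($name, $data) = @_;
    return sprintf "STORE Name=%s Content-Length=%d MD5=%s\n",
        $name, length($data), md5_hex($data);
}

# Write all of $data with syswrite(), looping because a single call
# may write fewer bytes than requested.
sub write_all {
    my ($fh, $data) = @_;
    my $off = 0;
    while ($off < length $data) {
        my $n = syswrite($fh, $data, length($data) - $off, $off);
        die "syswrite failed: $!\n" unless defined $n;
        $off += $n;
    }
    return $off;
}

# Demo: print the control line, then push the payload through a pipe
# that stands in for the separate data channel.
my $payload = "\x00binary\n\xffdata";
print store_header('blob.bin', $payload);

pipe(my $reader, my $writer) or die "pipe failed: $!";
binmode $_ for $reader, $writer;
write_all($writer, $payload);
close $writer;

my $received = do { local $/; <$reader> };
print length($received), " bytes arrived on the data channel\n";
```

For objects approaching 1GB, the payload would presumably be read from disk and written in chunks instead of being held in a single scalar.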

    However, I would prefer to in-line it in one channel, to avoid the complexity of the extra connection and extra processes/threads. HTTP transmits a mixture of text and binary data pretty readily through a single socket.
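One way the in-line variant might look, as a sketch only: the receiver reads the header line with `<$fh>` and then the body with the buffered `read()`, never `sysread()`, so both reads share the same PerlIO buffer. It assumes the `STORE` header format above travels on the same handle as the data, and `read_object` is a hypothetical name. An in-memory handle stands in for the socket.

```perl
use strict;
use warnings;

# Read one "STORE ... Content-Length=N" header line, then exactly N body bytes.
# Only buffered I/O (readline + read) is used on the handle.
sub read_object {
    my ($fh) = @_;
    binmode $fh;
    my $header = <$fh>;
    return unless defined $header;    # clean EOF: no more objects
    $header =~ s/\r?\n\z//;
    my ($len) = $header =~ /\bContent-Length=(\d+)/
        or die "missing Content-Length in header: $header\n";
    my $body = '';
    while (length($body) < $len) {
        # Append at the current end of $body until $len bytes have arrived.
        my $got = read($fh, $body, $len - length($body), length($body));
        die "read error: $!\n" unless defined $got;
        die "unexpected EOF after " . length($body) . " bytes\n" if $got == 0;
    }
    return ($header, $body);
}

# Demo: two objects back to back; the first body contains a newline.
my $stream = "STORE Name=a.bin Content-Length=5 MD5=unused\n" . "\x00\x01\n\x02\x03"
           . "STORE Name=b.bin Content-Length=2 MD5=unused\n" . "hi";
open my $fh, '<', \$stream or die "cannot open in-memory handle: $!";
while (my ($header, $body) = read_object($fh)) {
    printf "%s -> %d bytes\n", $header, length $body;
}
```

The length prefix is what makes binary bodies safe here: the reader never scans the body for a delimiter, so embedded newlines or NUL bytes cannot be mistaken for protocol text. A real receiver would likely stream large bodies to disk in chunks and verify the MD5 afterwards.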