in reply to Re^2: Native newline encoding
in thread Native newline encoding
That version supports opening files in TEXT mode (similar to FTP), and there are two ways to do it: the first is to convert the native newlines to CRLF before sending them over the network; the second is to tell the client what the native newline sequence is and let it carry the burden of the conversion.
Hm. My reading of the appropriate RFC is slightly different, in that the server can choose whether to send CRLF or a single-character line ending of its choice:
4.3 Determining Server Newline Convention

In order to correctly process text files in a cross platform compatible way, the newline convention must be converted from that of the server to that of the client, or, during an upload, from that of the client to that of the server.

Versions 3 and prior of this protocol made no provisions for processing text files. Many clients implemented some sort of conversion algorithm, but without either a 'canonical' on the wire format or knowledge of the servers newline convention, correct conversion was not always possible.

Starting with Version 4, the SSH_FXF_TEXT file open flag (Section 6.3) makes it possible to request that the server translate a file to a 'canonical' on the wire format. This format uses \r\n as the line separator.

Servers for systems using multiple newline characters (for example, Mac OS X or VMS) or systems using counted records, MUST translate to the canonical form.

However, to ease the burden of implementation on servers that use a single, simple separator sequence, the following extension allows the canonical format to be changed.

    string "newline"
    string new-canonical-separator (usually "\r" or "\n" or "\r\n")

All clients MUST support this extension.

When processing text files, clients SHOULD NOT translate any character or sequence that is not an exact match of the servers newline separator.

In particular, if the newline sequence being used is the canonical "\r\n" sequence, a lone \r or a lone \n SHOULD be written through without change.
And it is down to the clients to convert whatever the server sends to their required local form.
At this point, it seems to me that the simple solution is the first one: letting Perl read the file in text mode and then applying s/\n/\r\n/. This may be slightly incorrect in some edge cases (for instance, files on Windows with \n line endings) that nobody would care about, so I don't either!
I whole-heartedly agree, though I would approach that solution in a slightly different manner.
When TEXT mode is requested:
This way, whatever the local line separator is, it gets taken care of by Perl (or the CRT if you're using XS), and the data is transmitted with the required 'canonical newlines'.
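For illustration only, the sending side described above might look something like the following. This is a minimal sketch, not code from the thread: the helper name send_text_canonical and its arguments are hypothetical, and it assumes the file is opened in Perl's default text mode so that the native line ending arrives as "\n".

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): copy a local text file to an
# output handle, terminating every line with the canonical CRLF.
sub send_text_canonical {
    my ($path, $out) = @_;

    # Default text mode: Perl (or the CRT) maps the platform's native
    # line ending to "\n" as the file is read.
    open my $in, '<', $path or die "Cannot open $path: $!";

    # The outgoing handle must not apply any newline translation itself.
    binmode $out;

    local $/ = "\n";      # input record separator: one logical line per read
    local $\ = "\r\n";    # output record separator: appended by every print

    while (my $line = <$in>) {
        chomp $line;          # drop the trailing "\n", if present
        print {$out} $line;   # goes out as "$line\r\n"
    }
    close $in;
    return;
}
```

Note that this sketch also appends CRLF to a final line that had no trailing newline in the source file, which may or may not matter for a given use.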
Clients then do the same in reverse: read from the socket line-by-line, having set their input record separator ($/) to CRLF; chomp; and write line-by-line using the native line ending for their local platform.
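The receiving side could be sketched the same way. Again, this is an assumption-laden illustration rather than anything from the thread: receive_text_canonical is a hypothetical name, and the sketch assumes the wire format is the canonical "\r\n" (a server that announces a different separator through the "newline" extension would need $/ set to that value instead).

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): read CRLF-terminated lines
# from an input handle and write them to a local file in text mode.
sub receive_text_canonical {
    my ($in, $path) = @_;

    # Read the wire data untranslated so that "\r\n" is seen as sent.
    binmode $in;

    # Default text mode: "\n" is written as the platform's native ending.
    open my $out, '>', $path or die "Cannot open $path: $!";

    local $/ = "\r\n";    # a line ends only at an exact CRLF
    local $\ = "\n";      # appended by every print

    while (my $line = <$in>) {
        chomp $line;          # removes a trailing "\r\n" only
        print {$out} $line;
    }
    close $out or die "Cannot close $path: $!";
    return;
}
```

Because $/ is the full two-character sequence, a lone "\r" or "\n" inside the data is not treated as a line break by the reader, which is in the spirit of the SHOULD NOT in the quoted section.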
This way, the conversions are taken care of at both ends by Perl or the CRT. At least, for ASCII/ANSI/ISO-whatever-that-number-is files that have the 'correct' newlines on the originating platforms.
Things (will) get far more messy once the RFCs start dealing with Unicrap.