You've got that reversed. FTP's text mode (usually the default, or enabled with the ascii command) converts line endings to the platform's native form; binary mode preserves the contents of the file verbatim.

Re^3: Parsing a text file without newlines
by bart (Canon) on Dec 14, 2004 at 13:49 UTC
    The thing is, uploading a file in text mode from, say, Windows to Linux will simply strip all CR characters, whether or not any LF characters are present. All you have left is one long line.
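A minimal Python sketch of that effect, assuming the transfer does nothing more than drop every CR byte. The helper name strip_cr and the sample data are made up for illustration; this is not code taken from any FTP client or server.

```python
def strip_cr(data: bytes) -> bytes:
    """Mimic a transfer that drops every CR byte, whatever follows it."""
    return data.replace(b"\r", b"")


# Hypothetical sample data: one file with CRLF endings, one with bare CRs.
crlf_text = b"one\r\ntwo\r\nthree\r\n"
cr_only_text = b"one\rtwo\rthree\r"

crlf_result = strip_cr(crlf_text)
cr_only_result = strip_cr(cr_only_text)

# The CRLF file keeps its LFs, so it still splits into three lines.
print(crlf_result.splitlines())
# The CR-only file has no line terminators left and collapses into one line.
print(cr_only_result.splitlines())
```

A file that already used CRLF comes through fine; only a file with bare CR line endings ends up as a single line.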

      That seems like broken behavior on either the client or ftpd's part then. Quoting from the FTP RFC:

      3.1.1.1. ASCII TYPE

      This is the default type and must be accepted by all FTP implementations. It is intended primarily for the transfer of text files, except when both hosts would find the EBCDIC type more convenient.

      The sender converts the data from an internal character representation to the standard 8-bit NVT-ASCII representation (see the Telnet specification). The receiver will convert the data from the standard form to his own internal form. In accordance with the NVT standard, the <CRLF> sequence should be used where necessary to denote the end of a line of text. (See the discussion of file structure at the end of the Section on Data Representation and Storage.)

      Using the standard NVT-ASCII representation means that data must be interpreted as 8-bit bytes.

      The Format parameter for ASCII and EBCDIC types is discussed below.

      As that reads, the protocol says that in ascii mode you convert from your native line ending to the network line ending (CRLF) when sending, and from CRLF to your native line ending on receipt. If your client or server is just stripping CRs blindly, it's not living up to the spec.
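The two-step conversion described above can be sketched in Python roughly as follows. This is only an illustration of my reading of the RFC text, not how any particular client or ftpd implements it; the function names to_network and from_network and the native_eol parameter are hypothetical.

```python
def to_network(data: bytes, native_eol: bytes) -> bytes:
    """Sender side: turn the sender's native line ending into CRLF."""
    if native_eol == b"\r\n":
        return data  # already in network form
    return data.replace(native_eol, b"\r\n")


def from_network(data: bytes, native_eol: bytes) -> bytes:
    """Receiver side: turn each CRLF pair into the receiver's native ending."""
    return data.replace(b"\r\n", native_eol)


# Hypothetical case: a sender whose native line ending is a bare CR.
cr_native_file = b"one\rtwo\rthree\r"
on_the_wire = to_network(cr_native_file, b"\r")

# A receiver whose native line ending is LF.
received = from_network(on_the_wire, b"\n")
print(received.splitlines())

# A CR that is not part of a CRLF pair is left alone by the receiver.
print(from_network(b"a\rb\r\n", b"\n"))
```

Because the receiver only rewrites whole CRLF pairs, the line structure survives as long as the sender did its half of the job. Blindly dropping CRs skips the sender-side step, which is where the line breaks get lost.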

      I did just now test this with the stock NT 2000 ftp.exe and the vsftpd that I had installed on RH9, and it did just strip CRs, so the original post was correct about how things work (I still say it's broken, though :).