neomage has asked for the wisdom of the Perl Monks concerning the following question:

I want to be able to print just an "LF" to a file.

First, to clear up any ambiguity, I'm going to call the ASCII character with a hex code of 0A and a decimal code of 10 an "LF". I'm also going to call the ASCII character with a hex code of 0D and a decimal code of 13 a "CR".

As a simple test case:

open TEST, ">", "test.data" or die "Cannot open test.data: $!";
select TEST;
print "A" . chr(hex("0A")) . "B" . "\n" . "C" . "\r" . "D";
close TEST;

Logically, when I look at the test.data file with a hex editor I'd expect just an "LF" between the A and B, but instead there is a "CR" followed by an "LF".

Here is what a hex editor shows in "test.data" (the first line is hex, the second line is the corresponding characters):

41 0D 0A 42 0D 0A 43 0D 44
A CR LF B CR LF C CR D

I was expecting only an "LF" (i.e., 0A) between the "A" (41) and "B" (42) ...

If it helps, this is Perl 5.8.8 (ActiveState Perl Build 820) on Windows XP.

Thanks for any advice you may be able to give me.

Replies are listed 'Best First'.
Re: Trouble with newlines
by jettero (Monsignor) on May 12, 2007 at 10:04 UTC

    See: binmode

    Interestingly, moron, tye, and various others were talking about the nuances of $/ and perl file handles on the CB the other day. Apparently a "\n" can be multiple characters on the way out (or represent multiple on the way in) where appropriate on different platforms.

    And it would seem that this translation happens between perl and the OS. That is, $/ is "\n" on Windows. I was surprised by that ("fooled", as tye put it).

    -Paul

      And it would seem that this translation happens between perl and the OS. That is, $/ is "\n" on Windows.

      Provided you have the file open in text mode, that is :-) Update: that's when the translation happens, I mean. $/ doesn't change when you switch to binary mode.

      In any case this behaviour (probably not coincidentally) is exactly the same for C:

      The C programming language provides the escape sequences '\n' (newline) and '\r' (carriage return). However, contrary to popular belief, these are in fact not required to be equivalent to the ASCII LF and CR control characters. The C standard only guarantees two things:

      1. Each of these escape sequences maps to a unique implementation-defined number that can be stored in a single char value.

      2. When writing a file in text mode, '\n' is transparently translated to the native newline sequence used by the system, which may be longer than one character. (Note that a C implementation is allowed not to store newline characters in files. For example, the lines of a text file could be stored as rows of a SQL table or as fixed-length records.) When reading in text mode, the native newline sequence is translated back to '\n'. In binary mode, the second mode of I/O supported by the C library, no translation is performed, and the internal representation of any escape sequence is output directly.
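
The text-mode versus binary-mode distinction quoted above can be reproduced in Perl on any platform by naming the I/O layer explicitly instead of relying on the platform default. This is only a minimal sketch: the helper name bytes_written and the use of a temporary file are my own choices for illustration, not anything from the original post.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write "A\nB" through the given PerlIO layer,
# then read the file back untranslated and return its bytes as hex.
sub bytes_written {
    my ($layer) = @_;
    my ($tmp, $path) = tempfile(UNLINK => 1);
    close($tmp);

    open(my $out, ">$layer", $path) or die "Cannot open $path: $!";
    print {$out} "A\nB";
    close($out) or die "Cannot close $path: $!";

    open(my $in, "<:raw", $path) or die "Cannot open $path: $!";
    my $data = do { local $/; <$in> };    # slurp the whole file
    close($in);

    return join " ", map { sprintf "%02X", ord } split //, $data;
}

my $text   = bytes_written(":crlf");    # text mode as on DOSish systems
my $binary = bytes_written(":raw");     # no translation at all

print "text mode:   $text\n";
print "binary mode: $binary\n";
print "\$/ is a single character\n" if length($/) == 1;
```

With the :crlf layer the "\n" should come out as 0D 0A, and with :raw as a bare 0A, while $/ stays the same throughout.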

      Apparently a "\n" can be multiple characters on the way out
      It's actually documented. perlport has this to say about newlines:
      In most operating systems, lines in files are terminated by newlines. Just what is used as a newline may vary from OS to OS. Unix traditionally uses "\012", one type of DOSish I/O uses "\015\012", and Mac OS uses "\015".

      Perl uses "\n" to represent the "logical" newline, where what is logical may depend on the platform in use. In MacPerl, "\n" always means "\015". In DOSish perls, "\n" usually means "\012", but when accessing a file in "text" mode, STDIO translates it to (or from) "\015\012", depending on whether you're reading or writing. Unix does the same thing on ttys in canonical mode. "\015\012" is commonly referred to as CRLF.
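
The "to (or from)" part of the perlport passage, i.e. translation on the way in, can be sketched the same way. The example below assumes an ASCII platform, writes a CRLF-terminated line byte for byte, and reads it back through two different layers. The helper name first_line is hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a file holding one CRLF-terminated line, with no translation.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode($tmp);
print {$tmp} "line one\015\012";
close($tmp) or die "Cannot close $path: $!";

# Hypothetical helper: read the first line through the given layer.
sub first_line {
    my ($layer) = @_;
    open(my $in, "<$layer", $path) or die "Cannot open $path: $!";
    my $line = <$in>;
    close($in);
    return $line;
}

my $translated = first_line(":crlf");    # CRLF is collapsed to "\n"
my $untouched  = first_line(":raw");     # CRLF is left as two characters

printf "with :crlf the line has %d characters\n", length $translated;
printf "with :raw  the line has %d characters\n", length $untouched;
```

The :crlf read should report 9 characters and the :raw read 10, the extra one being the "\015" that text mode would have hidden.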


      Open source softwares? Share and enjoy. Make profit from them if you can. Yet, share and enjoy!

        The line endings are extremely annoying if one is on a dual-boot box (Win/*nix) and tries to tweak CGI scripts that are supposed to run on Apache/*nix as well as on IIS/Win.

        Whenever one writes "\n" in text mode to a file while booted into Windows, and later boots the other OS and reads the same file on *nix, the line endings turn out to be \r\n.
        Extremely annoying, imho. Sometimes one forgets about it and then wonders why things don't work as expected.
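
One common way to live with files that move between the two systems is to strip the line ending with an explicit pattern rather than relying on chomp, which only removes $/. This is a sketch of that idea, not something proposed in the thread, and strip_eol is a made-up name.

```perl
use strict;
use warnings;

# Hypothetical helper: remove one trailing line ending, LF or CRLF.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/\015?\012\z//;
    return $line;
}

for my $raw ("value\012", "value\015\012") {
    my $clean = strip_eol($raw);
    printf "%d characters: %s\n", length $clean, $clean;
}
```

Both inputs should come out as the same five-character string, so the rest of the script no longer cares which OS wrote the file.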

Re: Trouble with newlines
by shmem (Chancellor) on May 12, 2007 at 10:03 UTC
    Retry with binmode TEST.
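
For reference, here is roughly what the test case from the question looks like with that change applied. It is a sketch, not the original poster's code: the lexical filehandles and the read-back hex dump are additions for illustration.

```perl
use strict;
use warnings;

my $file = "test.data";

# Same data as in the question, but with translation switched off.
open(my $fh, ">", $file) or die "Cannot open $file: $!";
binmode($fh);    # binary mode: "\n" is written as a bare LF
print {$fh} "A" . chr(0x0A) . "B" . "\n" . "C" . "\r" . "D";
close($fh) or die "Cannot close $file: $!";

# Read the file back in binary mode and dump its bytes as hex.
open(my $in, "<", $file) or die "Cannot open $file: $!";
binmode($in);
my $bytes = do { local $/; <$in> };    # slurp the whole file
close($in);

my $hex = join " ", map { sprintf "%02X", ord } split //, $bytes;
print "$hex\n";
```

On Windows or Unix this should print 41 0A 42 0A 43 0D 44, i.e. no 0D sneaks in before either 0A.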

    --shmem

    _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                  /\_¯/(q    /
    ----------------------------  \__(m.====·.(_("always off the crowd"))."·
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}