brettasterling has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks,

I've searched extensively trying to find out how the default "Input Record Separator" ($/) is set, but all I'm finding is information on how it's used or how to set it myself. Based on the information I've found so far, $/ is supposed to be set appropriately for the OS, such as LF for UNIX type systems and CRLF for DOS/Windows type systems.

I'm currently running Perl on a Windows based system. As of last year, $/ was being set correctly to CRLF. However, something in the past year has changed such that $/ is now being set to LF. I'm hoping someone understands how this (important) variable is (supposed to be) automatically set.

Thanks!

Re: How is the default "Input Record Separator" set?
by haukex (Archbishop) on Mar 15, 2021 at 17:45 UTC
    $/ is supposed to be set appropriately for the OS, such as LF for UNIX type systems and CRLF for DOS/Windows type systems

    $/ is "\n" on Windows and on *NIX (I believe there may have been some ancient builds on Windows that used to do this differently). On Windows, the default PerlIO layers include the :crlf layer, which converts CRLF to \n on input and back on output. The :crlf layer is disabled by binmode or the pseudolayer :raw. For more details, see Newlines in perlport.
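To make the point above concrete, here is a minimal sketch (not part of the original reply). It requests the :crlf layer explicitly, so it should behave the same on Windows and *NIX; the temporary file and the variable names are assumptions chosen for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write two CRLF-terminated lines byte-for-byte (no layer translation).
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
binmode $tmp_fh;
print {$tmp_fh} "abc\r\ndef\r\n";
close $tmp_fh or die "close: $!";

# The default separator, shown as hex so that it is unambiguous.
my $sep_hex = join ' ', map { sprintf '%02X', ord } split //, $/;

# Read once through an explicitly requested :crlf layer ...
open my $text_fh, '<:raw:crlf', $tmp_name or die "$tmp_name: $!";
my $text_line = <$text_fh>;
close $text_fh;

# ... and once with no translation at all.
open my $raw_fh, '<:raw', $tmp_name or die "$tmp_name: $!";
my $raw_line = <$raw_fh>;
close $raw_fh;

printf "separator: %s\n", $sep_hex;
printf "via :crlf: %d chars\n", length $text_line;   # line is "abc\n"
printf "via :raw:  %d chars\n", length $raw_line;    # line is "abc\r\n"
```

The separator stays "\n" in both reads; only the layer decides whether the CR in the file reaches the script.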

    If you need specific help with a specific issue, please post an SSCCE that reproduces the issue, and a hex dump of the input file. See also my nodes here and here for a bit more information on how to gather and post information that will be useful to us in helping you.

    Update: The documentation of $/ begins with "The input record separator, newline by default.", and in regards to the term "newline", see perlglossary (though the "gets automatically translated by your C library" bit is slightly outdated) and the aforementioned perlport. Note that "Mac" generally refers to Classic Mac, since Mac OS X (Darwin) is a *NIX OS.

      $/ is "\n" on Windows and on *NIX (I believe there may have been some ancient builds on Windows that used to do this differently).

      It would have been before 5.6 if so.

      $/ was even "\n" on ancient MacOS which used CR as the line ending, though \n meant CR (0x0D) on such systems.

      Seeking work! You can reach me at ikegami@adaelis.com

Re: How is the default "Input Record Separator" set?
by BillKSmith (Monsignor) on Mar 15, 2021 at 19:27 UTC
    The variable $/ deals with records, the layer :crlf deals with lines. We tend to think of lines and records as being the same thing, but they do not have to be. I have never changed $/ for any reason except to make <> read a multi-line block (record) of text rather than a line. I expect the :crlf layer to translate windows line-separators (crlf), within the record, to perl line-separators (\n). Multi-line regexes used to parse the block work as expected.
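A rough sketch of the kind of multi-line read described above (not the poster's actual code): it assumes blank-line-separated blocks and uses paragraph mode ($/ = ""), with the :crlf layer requested explicitly so the result does not depend on the platform. The file contents and variable names are made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Two blank-line-separated blocks, written with Windows line endings.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
binmode $tmp_fh;
print {$tmp_fh} "name: a\r\nsize: 1\r\n\r\nname: b\r\nsize: 2\r\n";
close $tmp_fh or die "close: $!";

open my $in, '<:raw:crlf', $tmp_name or die "$tmp_name: $!";
my @records;
{
    local $/ = "";              # paragraph mode: a record ends at one or more blank lines
    while (my $record = <$in>) {
        chomp $record;          # in paragraph mode this removes all trailing newlines
        push @records, $record;
    }
}
close $in;

# Inside each record the line separators are plain "\n".
for my $record (@records) {
    my @lines = split /\n/, $record;
    printf "record with %d lines, first: %s\n", scalar @lines, $lines[0];
}
```

The layer translates CRLF to "\n" before readline looks for the record separator, so the regexes and the split only ever deal with "\n".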
    Bill

      Nonsense. Record is synonymous with line here. It specifically controls what readline considers a line. (<> is short for readline().)


        You do have a point, but I still think it makes far more sense to refer to the text between record_separators as 'records' and the text between newlines as 'lines' rather than overloading the word 'line' for both. I suspect that this is why the special variable '$/' is called '$INPUT_RECORD_SEPARATOR'. In this view, it is the name of the function 'readline' that is misleading.
        Bill
Re: How is the default "Input Record Separator" set?
by brettasterling (Initiate) on Mar 15, 2021 at 18:49 UTC

    Thanks for the responses - and I did do extensive searching (i.e. that's what I found when searching - that perl should never see the CR). However, I'm seeing something different (running as "perl cmds.pl" in a Command Prompt on Windows 10)

    File 'cmds.pl':

    open my $makefileToProcess, "<", "input_file.txt" or die "input_file.txt: $!";
    while (my $line = <$makefileToProcess>) {
        print "LINE BEFORE:$line:\n";
        chomp $line;
        print "LINE AFTER:$line:\n";
    }
    And what I see is:
    LINE BEFORE:xyzCRLF:
    LINE AFTER:xyzCR:

    (NOTE that CRLF is the CRLF pair, not actual 'C''R'L'F'. Same with CR.)

    What's even more interesting is that if, instead of running as "perl cmds.pl", I use the Windows ".pl" association and run simply as "cmds.pl", it works as you described (i.e. the CR is never seen).

    In this case, I have 2 different perl implementations installed, one being cygwin (perl version 5.26, which is the one that keeps the CR) and the other being a standalone perl (perl v5.8.8, which is the one that eliminates the CR).

    This is why I originally asked how the default IRS gets set, but perhaps the real question is why one perl on a Windows system is allowing the CR to pass through to the perl script.

      In this case, I have 2 different perl implementations installed, one being cygwin (perl version 5.26, which is the one that keeps the CR) and the other being a standalone perl (perl v5.8.8, which is the one that eliminates the CR).

      It's been a while since I worked with cygwin, but I assume that it's acting like a *NIX system would and not loading the :crlf layer by default - try perl -le "print for PerlIO::get_layers(*STDIN)" at the command line to check.

      Note that you can be really explicit about the fact that you want the :crlf layer to be loaded: try and see if open my $makefileToProcess, "<:raw:crlf", ... makes it work on both Perls.
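The two suggestions above can be combined into one small check. This is only a sketch under assumptions: the helper name first_line_via and the temporary file are invented for illustration, and the exact list of layers printed will differ between Perl builds.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# One CRLF-terminated line, written without any translation.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
binmode $tmp_fh;
print {$tmp_fh} "xyz\r\n";
close $tmp_fh or die "close: $!";

# Hypothetical helper: read the first line through the given layers,
# chomp it, and also report which layers were active on the handle.
sub first_line_via {
    my ($layers) = @_;
    open my $fh, "<$layers", $tmp_name or die "$tmp_name: $!";
    my @active = PerlIO::get_layers($fh);
    my $line = <$fh>;
    close $fh;
    chomp $line;
    return ($line, @active);
}

my ($explicit, @explicit_layers) = first_line_via(':raw:crlf');
my ($raw,      @raw_layers)      = first_line_via(':raw');

printf "explicit :crlf -> %d chars, layers: %s\n", length $explicit, "@explicit_layers";
printf "plain :raw     -> %d chars, layers: %s\n", length $raw,      "@raw_layers";
```

With the explicit :crlf layer the chomped line should be just "xyz"; with :raw alone the CR survives the chomp, which matches the "LINE AFTER:xyzCR:" output reported above.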

      Also note that 5.8.8 is now over 15 years old. You probably want to consider upgrading, see Strawberry Perl.

Re: How is the default "Input Record Separator" set?
by Marshall (Canon) on Mar 15, 2021 at 17:54 UTC
    I am also a Windows user.
    The text I/O layers will take out <CR> and what Perl sees is just <LF>.
    So \n for input is just one character, even on Windows.
    For output, Windows will emit <CR><LF> for an \n. On Unix, just a <LF>.

    I am curious as to why you think that something has changed?
    Some code would be very helpful!

    Update: Haukex replied while I was writing this. I agree with what he said. Unless you are using binmode, you will never see a <CR>. I am not even sure that you can even set $/ to CRLF. Please show some code where you think that you did that.

      We've been over this at length: Test before posting.

      Unless you are using binmode, you will never see a <CR>. I am not even sure that you can even set $/ to CRLF. Please show some code where you think that you did that.
      use warnings;
      use strict;
      use Data::Dumper;
      $Data::Dumper::Useqq=1;

      open my $fh, '>:raw', 'test.txt' or die $!;
      print $fh "x\ry\r\n";
      close $fh;

      open $fh, '<', 'test.txt' or die $!;
      my $in = <$fh>;
      close $fh;
      print Dumper($in); # "x\ry\n"

      open $fh, '<:raw', 'test.txt' or die $!;
      local $/ = "\r\n";
      chomp( $in = <$fh> );
      close $fh;
      print Dumper($in); # "x\ry"

      Which also means that your statement "The text I/O layers will take out <CR> and what Perl sees is just <LF>." is wrong: the :crlf layer doesn't just strip all CRs.

        Hi Haukex,

        OK, you are correct in that I did not consider the possibility of an embedded \r.
        "Unless you are using binmode, you will never see a <CR>." should be:
        "Unless you are using binmode, you will never see a <CR> as part of the line ending".

        Your code shows this: "x\ry\r\n" becomes "x\ry\n". The <CR> associated with the line ending is taken out. This is not a global deletion of just any <CR>. It is a modification of the line ending sequence, just as I said, albeit not as perfectly qualified as it could have been.

        From reading further info from the OP, it appears that this is some kind of strange cygwin issue. I used to have, but no longer have, a cygwin installation. cygwin is a weird neither-beast-nor-fowl thing. Once I found out that I couldn't really run SW that used Unix-specific features, I got rid of that thing in preference to simply using Windows ports of some of the Unix command line utilities.

        PS: I did not know if $/ could be set to "\r\n" or not and I said so. I did not give either a "yes" or a "no" answer. You show that it can indeed be set to that. Great to hear that additional info!