Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I have always found a solution that works by searching this site and others, but this problem has been driving me crazy for a few days. I'm coding in Windows (Windows 10) and trying to change a Windows-created text file that ends in CRLF to just LF, so that I can upload the file and process it on a Unix-based site. Everything that I have tried either gives me CRLF back, or in some cases I can get the file to end with just CR, but never just LF.


I know that this script is lengthy, but the one-liners haven't worked, so I was attempting to pull in the existing file, substitute CRLF with LF, push the result onto an array, then re-open the original filename and dump the array. Here's what doesn't work (I have an environment variable set which we can say is rbfile=myfile.txt):


use strict;
use warnings;

my $rxbfile = $ENV{rbfile};
open(IN, "$rxbfile") or die $!;
binmode IN;
my @array = "";
my $newline = "";
while (<IN>) {
    $newline = $_;
    $newline =~ s/\r\n/\012/;
    push @array,$newline;
};
close (IN);
open(OUT, ">$rxbfile") or die $!;
foreach (@array) {
    print OUT $_;
};
close (OUT);

Replies are listed 'Best First'.
Re: Changing CRLF in Windows to LF for UNIX
by choroba (Cardinal) on Oct 30, 2018 at 20:53 UTC
    The easiest way is to read the file line by line, printing each line after changing the newline. Using I/O layers makes the code even shorter:
    #! /usr/bin/perl
    use warnings;
    use strict;

    open my $IN, '<:raw:crlf', shift or die $!;
    open my $OUT, '>:raw', shift or die $!;
    print {$OUT} $_ while <$IN>;

    :raw turns off the CRLF automatic translation that is the default on MSWin (so the input doesn't really need the layers to be specified).

    Call as win2nix input.txt output.txt.
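
To see what the layers actually put on disk, here is a minimal sketch (not part of the original reply). The helper names bytes_on_disk and hex_dump are made up for illustration; the layers themselves (:raw, :crlf) are standard PerlIO. It writes the same string through two layer stacks and dumps the resulting bytes, so the CR (0d) added by :crlf becomes visible.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write $text through the given PerlIO layers, then
# return the exact bytes on disk (read back with :raw, so nothing is translated).
sub bytes_on_disk {
    my ($layers, $text) = @_;
    my ($tmp_fh, $path) = tempfile(UNLINK => 1);
    close $tmp_fh;

    open my $out, ">$layers", $path or die "Cannot write $path: $!";
    print {$out} $text;
    close $out or die "Cannot close $path: $!";

    open my $in, '<:raw', $path or die "Cannot read $path: $!";
    my $bytes = do { local $/; <$in> };   # slurp the whole file
    close $in;
    return $bytes;
}

# Hypothetical helper: render bytes as hex so CR (0d) and LF (0a) stand out.
sub hex_dump {
    my ($bytes) = @_;
    return join ' ', map { sprintf('%02x', ord($_)) } split //, $bytes;
}

print hex_dump(bytes_on_disk(':raw',      "a\n")), "\n";   # 61 0a
print hex_dump(bytes_on_disk(':raw:crlf', "a\n")), "\n";   # 61 0d 0a
```

Because the layers are given explicitly, the result should not depend on the platform default, which is the point of spelling them out.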

    ($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,
      I just found my problem, thanks in large part to your code. I had binmode attached to the initial (IN) file handle, but did not have it attached to the OUT handle. Adding "binmode OUT;" allows my original code to work. Thanks for the help!
      That worked, so thanks. But now I'm trying to figure out why my code didn't work earlier. I had binmode turned on, and I thought it defaulted to ":raw", and indeed that's what allowed me to get the Macintosh CR to show up solo at the end of each line. I did try to explicitly add ":raw" to the end, but that didn't help.
        Have you tried binmoding the OUT, too?


        You should use 3 arg open FILEHANDLE,MODE,EXPR. I tried out the code and it does work for me.

        >> I'm coding in Windows (Windows 10) and trying to change a Windows created text file that ends in CRLF to just LF, so that I can upload the file and process it on a Unix-based site.

        If you are running the script on Windows, this happens because when you open a file for writing (without binmode, :raw, or :unix), Perl translates "\n" into the Windows line ending "\r\n".
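
Putting the advice in this sub-thread together (3-arg open, and raw mode on the output handle as well as the input handle), a rewrite of the original script might look like the sketch below. It is an illustration, not the poster's final code: the function name crlf_to_lf_in_place is made up, and the filename is taken from the command line rather than from the rbfile environment variable.

```perl
use strict;
use warnings;

# Convert CRLF line endings to LF in place. Both handles use :raw so that
# Perl on Windows neither strips CR on input nor re-adds it on output.
sub crlf_to_lf_in_place {
    my ($path) = @_;

    open my $in, '<:raw', $path or die "Cannot read $path: $!";
    my $content = do { local $/; <$in> };    # slurp the whole file
    close $in;
    $content = '' unless defined $content;   # guard for an empty file

    $content =~ s/\r\n/\n/g;                 # /g: every CRLF, not only the first

    open my $out, '>:raw', $path or die "Cannot write $path: $!";
    print {$out} $content;
    close $out or die "Cannot close $path: $!";
    return;
}

# Hypothetical invocation: perl crlf2lf.pl myfile.txt
crlf_to_lf_in_place($ARGV[0]) if @ARGV;
```

Slurping keeps the sketch short; for very large files a line-by-line copy to a second file, as in choroba's reply, avoids holding everything in memory.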

Re: Changing CRLF in Windows to LF for UNIX
by stevieb (Canon) on Oct 30, 2018 at 20:55 UTC

    I wrote File::Edit::Portable for things like this, particularly when opening files on systems where you may not have any idea on what the endings are:

    use warnings;
    use strict;

    use File::Edit::Portable;

    my $rw = File::Edit::Portable->new;

    my $file = 'scrap.txt';

    # show what endings the file has currently
    print $rw->recsep($file, 'hex') . "\n";

    my @contents = $rw->read('scrap.txt');

    # rewrite the file with the desired record separator
    $rw->write(
        contents => \@contents,
        recsep => "\n"
    );

    # check to see if it took
    print $rw->recsep($file, 'hex') . "\n";

    Output:

    \0d\0a # before (\r\n)
    \0a # after (\n)

    If you've got vi/vim, simply open the file, then:

    :set ff=unix
Re: Changing CRLF in Windows to LF for UNIX
by Discipulus (Canon) on Oct 30, 2018 at 21:00 UTC
    Hello mabowden and welcome to the monastery and to the wonderful world of Perl!

    > one liners havent worked..

    Strange: they always work! See The ultimate guide to Windows and Unix file line ending conversion in Perl by David Farrell:

    (On windows) By default Perl changes the value of “\n” to CRLF. This means that the regex match: “/\015\012/” will fail on Windows as Perl is actually running: “/\015\015\012/”. Regexes using meta-characters and hex codes (“/\r\n/” and “/\x0d\x0a/”) fail for the same reason.
    ...
    perl -pe "binmode(STDOUT);s/\R/\012/" /path/to/file > /path/to/new/file
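
For reference, the \R used in that one-liner can be tried on a plain string. This is only a sketch; the function name normalize_newlines is hypothetical. \R needs Perl 5.10 or later, treats "\r\n" as a single unit, and also matches a lone CR or LF (as well as a few rarer vertical-whitespace characters such as form feed). The one-liner gets away without /g because -p handles one line at a time; on a whole string /g is needed.

```perl
use strict;
use warnings;

# Replace any line ending (CRLF, lone CR, or LF) with a single LF.
# Works on a copy, so the caller's string is left untouched.
sub normalize_newlines {
    my ($text) = @_;
    (my $copy = $text) =~ s/\R/\n/g;
    return $copy;
}

my $mixed = "dos\r\nold mac\runix\n";
my $clean = normalize_newlines($mixed);
print $clean eq "dos\nold mac\nunix\n" ? "normalized\n" : "unexpected\n";
```

This only covers the substitution itself. When the result is written to a file on Windows, the output handle still needs binmode or :raw, which is what the binmode(STDOUT) in the one-liner is for.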

    L*

    PS see also newline gory details by afoken

    There are no rules, there are no thumbs..
    Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
      I'm quite sure that it "worked" in the sense of doing what I told it to do, but didn't work in the sense that I didn't get the expected output ...
Re: Changing CRLF in Windows to LF for UNIX
by localshop (Monk) on Oct 30, 2018 at 21:07 UTC
    brew install dos2unix

    or yum or apt-get etc

Re: Changing CRLF in Windows to LF for UNIX
by BillKSmith (Monsignor) on Oct 31, 2018 at 14:52 UTC
    The following suggestion is not nearly as general as other solutions, but this should solve your problem. On your Windows machine, by default, perl translates your input file into perl's internal representation. On output, you can specify the :unix I/O layer to translate this internal representation into Unix file format. You must not use binmode on either handle, because that would turn off the translation.
    >perl -pe"BEGIN{open STDOUT, '>:unix', 'unix_text.txt'}" win_text.txt
    Bill
Re: Changing CRLF in Windows to LF for UNIX
by mabowden (Novice) on Oct 30, 2018 at 20:39 UTC

    I actually made this post and thought I was logged in, but obviously got logged out