PerlMonks  

Line Ending Converter

by kayos (Sexton)
on Apr 25, 2000 at 19:50 UTC ( [id://8991]=sourcecode )
Category: data formatting
Author/Contact Info T.R. Fullhart, kayos@kayos.org
Description:

This converts the line-endings of a text file (with unknown line-endings). It supports DOS-type, Unix-type, and Mac-type. It converts the files "in place", so be careful.

You call it like:

linendings --unix file1.txt file2.txt ...
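#!/usr/bin/perl
use strict;
use warnings;

my $lineending = "\n";

my $type = @ARGV ? shift @ARGV : '';

if( $type =~ /unix/ ) {
        $lineending = "\012";
} elsif( $type =~ /dos/ ) {
        $lineending = "\015\012";
} elsif( $type =~ /mac/ ) {
        $lineending = "\015";
} else {
        print "Usage: $0 --unix|--dos|--mac file ...\n";
        exit 1;
}

my @files = @ARGV;

for my $file ( @files ) {
        # skip files that can't be read, so they are never truncated
        open FILE, '<', $file or next;        # thanks turnstep
        binmode FILE;                          # read the bytes untranslated
        my @lines = <FILE>;
        close FILE;

        foreach my $i ( 0..$#lines ) {
                # CRLF, lone CR, or lone LF each become one new line ending
                $lines[$i] =~ s/(\012|\015\012?)/$lineending/g;
        }

        open FILE, '>', $file or die "Can't write $file: $!";
        binmode FILE;                          # write the bytes untranslated
        print FILE @lines;
        close FILE;
}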
Replies are listed 'Best First'.
Re: Line Ending Converter
by planetscape (Chancellor) on May 01, 2005 at 08:21 UTC
RE: Line Ending Converter
by turnstep (Parson) on Apr 25, 2000 at 20:58 UTC
    You should check the value of the first open and abort that file if it fails:
    open FILE, $file or next;
    Otherwise, if the first open fails and the second is successful, the file will be emptied out (i.e. erased). Ouch!
      thanks turnstep, I could have nuked my files!!
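The failure mode described above can be avoided by checking both opens and by only writing once the read has succeeded. The following is a minimal sketch of that idea, not the original author's code; the helper name `convert_file` is hypothetical. Note that it relies on `or` rather than `||`: without parentheses, `||` would bind to `$file` instead of to the result of `open`.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite $file in place with $eol as the line ending.
# Returns 1 on success, 0 if the file could not be read (nothing is written).
sub convert_file {
    my ($file, $eol) = @_;

    # "or" has lower precedence than the argument list, so this tests open's
    # return value. With "||" and no parentheses it would test $file instead.
    open my $in, '<', $file or return 0;
    binmode $in;
    my $text = do { local $/; <$in> };    # slurp the whole file
    close $in;
    $text = '' unless defined $text;      # guard for an empty file

    $text =~ s/(?:\015\012?|\012)/$eol/g;

    # Only reached when the read succeeded, so a failed read can never
    # lead to an emptied file.
    open my $out, '>', $file or die "Can't write $file: $!";
    binmode $out;
    print {$out} $text;
    close $out or die "Can't close $file: $!";
    return 1;
}

for my $file (@ARGV) {
    convert_file($file, "\012") or warn "Skipping $file: $!\n";
}
```

Run with one or more file names as arguments; unreadable files are reported and left untouched.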
RE: Line Ending Converter
by KM (Priest) on May 20, 2000 at 04:27 UTC
    A possible problem here is that you are only converting \015\012 and \012\015, which works for UNIX->Mac|Win32, but not the other way around. Just a minor nit.

    You can also do this from the command line:

    perl -pi -e 's![\012\015]{1,2}!$/!g' file
    Of course, using $/ means you need to be on the desired final OS, but change that to whatever you want it to be.
      The code:

      perl -pi -e 's![\012\015]{1,2}!$/!g' file

      will not work right for mac->unix, mac->dos or unix->mac, because two consecutive \015 or \012 chars will be converted into single ones. So if you have blank lines, they'll get eaten. A safer alternative (though perhaps not an optimal RE) is:

      perl -pi -e 's!(?:\015\012?|\012)!$/!g' file

      update: added safer alternative, made RE 17% faster
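To make the blank-line problem concrete, here is a small sketch that applies both patterns to the same Mac-style string. The function names `convert_greedy` and `convert_safe` are made up for this illustration, and the target line ending is passed in explicitly instead of using `$/`.

```perl
use strict;
use warnings;

# Collapsing pattern: any run of one or two CR/LF characters becomes one
# line ending, so two consecutive CRs (a blank line) are merged.
sub convert_greedy {
    my ($text, $eol) = @_;
    $text =~ s/[\012\015]{1,2}/$eol/g;
    return $text;
}

# Safer pattern: CRLF, a lone CR, or a lone LF each count as exactly one
# line ending.
sub convert_safe {
    my ($text, $eol) = @_;
    $text =~ s/(?:\015\012?|\012)/$eol/g;
    return $text;
}

# Mac-style input with a blank line between "a" and "b".
my $mac = "a\015\015b\015";

my $greedy = convert_greedy($mac, "\012");
my $safe   = convert_safe($mac, "\012");

# Count the line endings that survive each conversion.
my $n_greedy = () = $greedy =~ /\012/g;
my $n_safe   = () = $safe   =~ /\012/g;

print "greedy: $n_greedy line endings\n";
print "safe:   $n_safe line endings\n";
```

The input has three line endings. The collapsing pattern leaves two (the blank line is gone), while the safer pattern keeps all three.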

      I get confused. Is a return in a DOS text file \012\015 or \015\012? T.R. Fullhart, kayos@kayos.org
        I use a little mnemonic device to remember this: "DOS is DOA". Well, actually, 0D0A, which converted to octal, is obviously \015\012.

        I use another mnemonic for \r\n. Because \n\r is not right :-)

           MeowChow                                   
                       s aamecha.s a..a\u$&owag.print
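If the mnemonic doesn't stick, the hex-to-octal correspondence is easy to check from Perl itself. A short sketch, assuming nothing beyond core `printf`, `sprintf` and `ord`:

```perl
use strict;
use warnings;

# Print each control character by name, in hex and in octal.
for my $pair ( [ "\015", 'CR' ], [ "\012", 'LF' ] ) {
    my ($char, $name) = @$pair;
    printf "%s: hex 0x%02X, octal \\%03o\n", $name, ord($char), ord($char);
}

# A DOS line ending is CR followed by LF, i.e. the bytes 0D 0A.
my $dos = "\015\012";
print "DOS: ", join( ' ', map { sprintf( '%02X', ord($_) ) } split( //, $dos ) ), "\n";
```

This prints CR as 0x0D / \015, LF as 0x0A / \012, and the DOS pair as `0D 0A`.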
RE: Line Ending Converter
by mdillon (Priest) on May 26, 2000 at 19:52 UTC
    I like the following regexp for matching newlines: m/(\n|\r\n?)/
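The same pattern can also tell you which style a file with unknown line endings uses. Below is a rough sketch with a hypothetical helper, `count_line_endings`; it assumes an ASCII platform where `\n` is \012 and `\r` is \015.

```perl
use strict;
use warnings;

# Hypothetical helper: tally which line-ending styles appear in a string.
sub count_line_endings {
    my ($text) = @_;
    my %count = ( dos => 0, mac => 0, unix => 0 );

    # \r\n? consumes a following LF when there is one, so a CRLF pair is
    # counted once as "dos" and never again as "mac" plus "unix".
    while ( $text =~ /(\n|\r\n?)/g ) {
        my $eol = $1;
        if    ( $eol eq "\015\012" ) { $count{dos}++ }
        elsif ( $eol eq "\015" )     { $count{mac}++ }
        else                         { $count{unix}++ }
    }
    return \%count;
}

my $c = count_line_endings("one\015\012two\012three\015");
printf "dos=%d mac=%d unix=%d\n", @{$c}{qw(dos mac unix)};
```

For the mixed sample string this prints `dos=1 mac=1 unix=1`.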
Re: Line Ending Converter
by selimnairb (Initiate) on Nov 25, 2002 at 19:27 UTC
    I needed something to convert arbitrarily between dos, mac and unix line endings, so I modified the original script. It could stand some improving (e.g. I broke being able to act on multiple files), and further testing, but it works for my purposes so here it is (hope it doesn't get too garbled):
    #!/usr/bin/perl
    # Based on: http://perlmonks.thepen.com/8991.html
    #
    # Extended to handle to->from conversion of the following
    # platforms: dos, mac, unix
    #
    # 20021125 miles@cmu.edu
    #
    use strict;
    use Getopt::Long;
    use File::Copy;

    my ($optFrom, $optTo, $optInFile, $optBackup);
    GetOptions('from=s'   => \$optFrom,
               'to=s'     => \$optTo,
               'infile=s' => \$optInFile,
               'backup'   => \$optBackup);

    my $usage = "Usage: $0 --from <platform> --to <platform> --infile <path_to_file> [--backup]\n\n"
              . "Converts line endings of text files from a given platform to line endings\n"
              . "of a given platform.  Allowable platforms are:\n"
              . "\tdos (CRLF)\n\tmac (CR)\n\tunix (LF)\n"
              . "From and to platforms must be different\n\n"
              . "Optional backup switch causes infile to be saved as another file with '.bak'\n"
              . "appended to its name before the new file is written.\n";

    if (!defined($optFrom) || !defined($optTo) || !defined($optInFile)) {
        print "$usage\n";
        exit 1;
    }
    if (!defined($optBackup)) {
        $optBackup = 0;
    }

    my ($lineEndingFrom, $lineEndingTo);
    if ($optFrom eq $optTo) {
        print "$usage\n";
        exit 1;
    }

    if ($optFrom eq 'unix') {
        $lineEndingFrom = "\012";
    } elsif ($optFrom eq 'dos') {
        $lineEndingFrom = "\015\012";
    } elsif ($optFrom eq 'mac' ) {
        $lineEndingFrom = "\015";
    } else {
        print "$usage\n";
        exit 1;
    }

    if ($optTo eq 'unix') {
        $lineEndingTo = "\012";
    } elsif ($optTo eq 'dos') {
        $lineEndingTo = "\015\012";
    } elsif ($optTo eq 'mac' ) {
        $lineEndingTo = "\015";
    } else {
        print "$usage\n";
        exit 1;
    }

    my @files;
    push(@files, $optInFile);

    for my $file ( @files ) {
        open FILE, $file or next;    # thanks turnstep
        my @lines = <FILE>;
        close FILE;

        foreach my $i ( 0..$#lines ) {
            $lines[$i] =~ s/$lineEndingFrom/$lineEndingTo/g;
        }

        if ($optBackup) {
            copy($file, "$file.bak");
        }

        open FILE,">$file";
        print FILE @lines;
        close FILE;
    }
    # EOF
Re: Line Ending Converter
by William G. Davis (Friar) on Nov 22, 2003 at 15:48 UTC

    Don't forget to use binmode. Some operating systems will play tricks with the line endings when you write to a file unless they're told explicitly not to.

    Here's the converter I use. It converts to either CR, CRLF, or LF depending on what you specify to -n, and can modify a single file, multiple files, or all the files in a directory using File::Find:

    #!/usr/bin/perl -w
    use strict;
    use Getopt::Std;
    use File::Find;

    # get the options:
    my %opts;
    getopts('f:n:h', \%opts) || usage();
    usage() if (!$opts{'n'} || $opts{'h'});

    # if no files were specified, we'll convert everything in the current
    # directory:
    push(@ARGV, '.') unless @ARGV;

    my $newline = $opts{'n'};
    usage() if ($newline =~ /[^CRLF]/i);
    $newline =~ s/CR/\015/i;
    $newline =~ s/C/\015/i;
    $newline =~ s/R/\015/i;
    $newline =~ s/LF/\012/i;
    $newline =~ s/L/\012/i;
    $newline =~ s/F/\012/i;

    foreach my $filename (@ARGV) {
        # traverse the directory tree and look at each file:
        find(sub { convertNewlines() }, $filename);
    }

    sub convertNewlines
    {
        my $filename = $_;

        # don't mess with it unless it's a text file:
        return unless (-T $filename);

        open(FILE, "< $filename")
            or die "Couldn't open file ($filename) for reading: $!";

        my $converted_text;
        my $line_endings_converted = 0;
        while (my $line = <FILE>) {
            $line_endings_converted +=
                ($line =~ s/(?:\015\012|\015|\012)/$newline/g);
            $converted_text .= $line;
        }

        # now save it, and binmode it so no additional conversion is done to
        # the line endings:
        open(FILE, "> $filename")
            or die "Couldn't open file ($filename) for writing: $!";
        binmode FILE;
        print FILE $converted_text;
        close FILE;

        print "Converted $line_endings_converted newlines in \"$filename\" "
            . "to $opts{'n'}.\n";
    }

    sub usage
    {
        print <<'END_OF_USAGE';
    This script can be used to convert the line endings in files to Unix,
    Windows, or MacOS line endings.

    Usage:
        $ newlines -n NEWLINE [FILENAMES...]

    Arguments:
        -n    The newline sequence that the line endings in the files you
              specified should be converted to. Either "CR" or "R" for carriage
              return, "LF" or "L" for linefeed, or "CRLF" for carriage
              return/linefeed.

    Flags:
        -h    Displays this message.
    END_OF_USAGE

        exit;
    }

    For example, this:

    newlines -n CRLF foo.txt bar.txt foo_bar.txt stuff.* ./more_text_files

    converts the line endings in "foo.txt", "bar.txt", "foo_bar.txt", the text files in the current directory named "stuff" with any extension, and all of the text files in "./more_text_files" to CRLF (\015\012).
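To see what binmode buys you, the sketch below writes a CRLF string with binmode set and reads it back the same way. The helper names `write_raw` and `read_raw` are hypothetical, and unlike the script above it applies binmode on the read side as well, which is an addition of this example. On Unix-like systems binmode changes nothing; the difference shows up on platforms whose default I/O layer translates line endings.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write $data to $path byte-for-byte.
sub write_raw {
    my ($path, $data) = @_;
    open( my $out, '>', $path ) or die "Couldn't open $path for writing: $!";
    binmode $out;    # no newline translation on output
    print {$out} $data;
    close $out or die "Couldn't close $path: $!";
}

# Hypothetical helper: read $path byte-for-byte.
sub read_raw {
    my ($path) = @_;
    open( my $in, '<', $path ) or die "Couldn't open $path for reading: $!";
    binmode $in;     # no newline translation on input
    local $/;        # slurp mode: read the whole file at once
    my $data = <$in>;
    close $in;
    return $data;
}

# Round-trip a two-line CRLF string through a temporary file.
my ( $tmp_fh, $path ) = tempfile( UNLINK => 1 );
close $tmp_fh;

my $crlf = "one\015\012two\015\012";
write_raw( $path, $crlf );
my $back = read_raw($path);

printf "%d bytes written, %d bytes on disk, %d bytes read back\n",
    length($crlf), ( -s $path ), length($back);
```

With binmode on both handles, all three counts should agree, which is the property the converter depends on.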

Re: Line Ending Converter
by Anonymous Monk on Jun 04, 2001 at 12:08 UTC
    yup !! thats what i wanted, thanks.. sachin.
