Can anyone figure how this works?

thesundayman has asked for the wisdom of the Perl Monks concerning the following question:

Re: Can anyone figure how this works? by chromatic (Archbishop) on Sep 29, 2001 at 22:40 UTC
It opens an input file (first command line argument) for reading and an output file (second command line argument) for writing. Next, it loops through the lines of the input file. The pattern match just verifies that the line is only a newline character. If so, instead of printing that line to the output file, it prints the standard DOSish \r\n combination. Otherwise, it prints the line verbatim. Yeah, the explanation's longer than the code.

Update: Let's further compact tachyon's replacement into a one-liner: `perl -pi.bak -e "s/^\n$/\r\n/" <filename>`
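To see what the pattern match described above does and does not touch, here is a minimal sketch. The helper names `fix_blank_line` and `hex_bytes` are made up for illustration and are not part of the original script; the substitution itself is the one from the one-liner.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the one-liner's substitution to a single line.
sub fix_blank_line {
    my ($line) = @_;
    $line =~ s/^\n$/\r\n/;    # only a line that is exactly "\n" matches
    return $line;
}

# Hypothetical helper: render a string as space-separated hex bytes
# so the line endings are visible.
sub hex_bytes {
    my ($text) = @_;
    return join ' ', map { sprintf '%02X', ord($_) } split //, $text;
}

# A blank line, a line with text, and a line holding a single space.
for my $line ("\n", "text\n", " \n") {
    printf "%-16s -> %s\n", hex_bytes($line), hex_bytes(fix_blank_line($line));
}
```

Only the first of the three sample lines changes. A line that contains a space before the newline is not "only a newline", so it is printed verbatim.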
Re: Can anyone figure how this works? by tachyon (Chancellor) on Sep 29, 2001 at 22:51 UTC
Presumably you call this script like this:

    $ fix.pl wrong.data fixed.data

Here is a blow by blow:

    # command line arguments are available to the script in the
    # @ARGV array. Thus the first argument is in $ARGV[0], the
    # second in $ARGV[1]....

    # Open the file specified in the first command line arg for reading
    open(IN, "$ARGV[0]") || die "unable to open $ARGV[0]";

    # Open the file specified in the second command line arg for writing
    open(OUT, ">$ARGV[1]") || die "unable to open $ARGV[1]";

    # stop perl making the automatic \n => \r\n or \n => \r line ending
    # conversions on output which are required on Win32 and Mac respectively
    binmode(OUT); # set output mode as binary

    # now iterate over our input file one line at a time
    while(<IN>) {
        # if we have a line that contains only "\n" - ie a blank line
        if(/^\n$/) {
            # then we print "\r\n" instead of the existing "\n" into our output file
            print OUT "\r\n";
        }
        # otherwise just print out the totally unaltered line
        else {
            print OUT;
        }
    }

    # close the input and output files
    close(IN);
    close(OUT);

If you want a short way to do the same, this will do it with an inplace edit. You call it like this: `fix.pl data`. The data in `data` will get munged and a backup will be made called `data.bak`. The backup will contain the original data, the argument file the modified data.

    #!/usr/bin/perl -i.bak -w
    while (<>) {
        s/^\n$/\r\n/;
        print;
    }

cheers

tachyon

s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print
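For comparison, the same loop can be sketched with lexical filehandles and three-argument open. This is an editorial variation, not the script from the question: the sub name `convert_blank_lines` is hypothetical, and the switch to three-argument open is an assumption about style. The logic is unchanged.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the same read/convert/write loop.
sub convert_blank_lines {
    my ($in_path, $out_path) = @_;

    # Three-argument open keeps the mode separate from the file name.
    open my $in,  '<', $in_path  or die "unable to open $in_path: $!";
    open my $out, '>', $out_path or die "unable to open $out_path: $!";
    binmode $out;    # no line ending conversion on output

    while (my $line = <$in>) {
        # A line that is only "\n" is written as "\r\n" instead.
        $line = "\r\n" if $line =~ /^\n$/;
        print {$out} $line;
    }

    close $in;
    close $out or die "unable to close $out_path: $!";
    return;
}

# Run only when called with an input and an output file name.
convert_blank_lines(@ARGV[0, 1]) if @ARGV == 2;
```

Reporting `$!` in the error message says why the open failed, and each message names the file that was actually being opened.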
Re: Re: Can anyone figure how this works? by demerphq (Chancellor) on Sep 30, 2001 at 05:38 UTC
My only question is with regard to the inplace edit. How does it interact with binmode? Or has binmode been left out because it appears unneeded? Slightly confused, :-)

Yves

-- You are not ready to use symrefs unless you already know why they are bad. -- tadmc (CLPM)
Re: Re: Re: Can anyone figure how this works? by tachyon (Chancellor) on Sep 30, 2001 at 22:10 UTC
In unix the default line ending is a line feed, LF (\n or \012 or 0x0A). In DOSWin it is carriage return line feed, CRLF (\r\n or \015\012 or 0x0D 0x0A). On a Mac it is CR (\r or \015 or 0x0D).

When you read and write text files perl will automatically convert from its internal use of \012 for the line ending to whatever the system is using. On unix this means it does nothing, but on Mac and Windows conversions are made.

Binmode tells perl not to convert line endings when reading or writing. In other words it does a raw read/write. On unix binmode has no effect as there is no conversion to make. On other systems the results are quite predictable.

Perl uses \012 as the default line ending, so if you binmode an output file handle such as STDOUT, $fh, etc and then `print "blah \n"` you will write \012. This will not be correctly recognised as a line ending on DOSWin or Mac when trying to read this file. This is true for any program, including perl programs. Unix, however, will read this file fine, as will perl running under unix.

When you binmode an input filehandle such as STDIN, $fh, etc you get a raw read. Thus on DOSWin if you read in a textfile under binmode you will see that the line ending is \r\n. Visually this can appear as double spaced lines.

How your script behaves is system dependent. When you read from a file you read one line at a time as defined by the default system line ending. Internally this will be represented as \n.

Let's assume you have a file that has \r\n line endings. On unix the lines you read will still end in \r\n because no conversion is done. Reading the same file under DOSWin will give you lines ending in only \n.

With binmode on an output FH, when you output \r\n that is what you get. With binmode off things will differ. On unix you will still get \r\n. On DOSWin you will get \r\r\n as the \n is converted to \r\n.

Unless you are writing text files across systems you do not need to worry too much.
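The output side of this can be tried on any system with a small sketch. It assumes a perl with PerlIO layers (5.8 or later, which is newer than this thread): the `:crlf` layer stands in for DOSWin text mode and `:raw` for a binmoded handle. `bytes_written` is a hypothetical helper name.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write $text through the given PerlIO layer and
# return the bytes that ended up in the file, as space-separated hex.
sub bytes_written {
    my ($layer, $text) = @_;
    my ($tmp, $path) = tempfile(UNLINK => 1);
    close $tmp;

    open my $out, ">$layer", $path or die "cannot write $path: $!";
    print {$out} $text;
    close $out or die "cannot close $path: $!";

    # Read the file back raw so nothing is converted on the way in.
    open my $in, '<:raw', $path or die "cannot read $path: $!";
    my $raw = do { local $/; <$in> };
    close $in;

    return join ' ', map { sprintf '%02X', ord($_) } split //, $raw;
}

printf "%-16s %s\n", ':raw  "\n"',   bytes_written(':raw',  "\n");    # 0A
printf "%-16s %s\n", ':crlf "\n"',   bytes_written(':crlf', "\n");    # 0D 0A
printf "%-16s %s\n", ':crlf "\r\n"', bytes_written(':crlf', "\r\n");  # 0D 0D 0A
```

The last line is the \r\r\n case described above: printing an explicit \r\n through a converting handle doubles the carriage return, which is why the original script binmodes OUT before printing "\r\n".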
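If you need binmode on the inplace edit script, binmode the handle that -i actually prints to. That is ARGVOUT rather than STDOUT, so call binmode ARGVOUT inside the loop, once the file is open.

cheers

tachyon

s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print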
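A minimal sketch of the in-place version with binmode applied to the output. It assumes the handle to binmode is ARGVOUT, which is where perlrun documents -i as sending print output. `fix_in_place` is a hypothetical name, and setting `$^I` inside a sub is just a way to get the effect of `-i.bak` without the command line switch.

```perl
use strict;
use warnings;

# Hypothetical helper: edit each named file in place, keeping a .bak copy.
sub fix_in_place {
    my @files = @_;
    return unless @files;    # with no files, <> would read STDIN

    # Localizing $^I and @ARGV has the same effect as running with -i.bak.
    local ($^I, @ARGV) = ('.bak', @files);

    while (<>) {
        binmode ARGVOUT;     # the in-place output handle; repeating is harmless
        s/^\n$/\r\n/;        # blank line gets a DOS line ending
        print;               # print goes to ARGVOUT while -i is active
    }
    return;
}

fix_in_place(@ARGV);
```

On unix the binmode call changes nothing. On DOSWin it should keep the explicit \r\n from being written as \r\r\n.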
Re: Re: Re: Re: Can anyone figure how this works? by thesundayman (Novice) on Oct 03, 2001 at 14:57 UTC
Re: Re: Re: Re: Re: Can anyone figure how this works? by tachyon (Chancellor) on Oct 03, 2001 at 18:49 UTC