saintex has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks,
I am seeing strange behaviour with simple file reading. I simply want to read the lines, but the script seems to interpolate something into the lines, or truncate them.

I don't know what exactly is happening, so please can you explain?

This code works fine on Windows, while on Ubuntu 64-bit (Perl 5.10) the lines are truncated.

The file to read: directories.txt:
# text 1
# text 2
# text 3
# text 4
# text 5
/home/wolf/Scrivania/foto-recco
/home/wolf/Scrivania/rescue
The code:
use FindBin qw ($Bin);

my $test='hello';
my $ref2Test=\$test;
my $source="$Bin/directories.txt";

open DIRECTORIES, "<",$source;
while(<DIRECTORIES>) {
    # chomp;
    print "this is \$_: $_\n";
    print "An that is my string:\n";
    print "initial $$ref2Test middle '$_' final ";
    print "\n----------------------\n\n";
}
close DIRECTORIES;


Output without the chomp line (but I need to chomp):
this is $_: # text 1
An that is my string:
initial hello middle '# text 1
' final
----------------------
this is $_: # text 2
An that is my string:
initial hello middle '# text 2
' final
----------------------
this is $_: # text 3
An that is my string:
initial hello middle '# text 3
' final
----------------------
this is $_: # text 4
An that is my string:
initial hello middle '# text 4
' final
----------------------
this is $_: # text 5
An that is my string:
initial hello middle '# text 5
' final
----------------------
this is $_: /home/wolf/Scrivania/foto-recco
An that is my string:
initial hello middle '/home/wolf/Scrivania/foto-recco
' final
----------------------
this is $_: /home/wolf/Scrivania/rescue
An that is my string:
initial hello middle '/home/wolf/Scrivania/rescue' final
----------------------

Output with chomp:
this is $_: # text 1
An that is my string:
' final ello middle '# text 1
----------------------
this is $_: # text 2
An that is my string:
' final ello middle '# text 2
----------------------
this is $_: # text 3
An that is my string:
' final ello middle '# text 3
----------------------
this is $_: # text 4
An that is my string:
' final ello middle '# text 4
----------------------
this is $_: # text 5
An that is my string:
' final ello middle '# text 5
----------------------
this is $_: /home/wolf/Scrivania/foto-recco
An that is my string:
' final ello middle '/home/wolf/Scrivania/foto-recco
----------------------
this is $_: /home/wolf/Scrivania/rescue
An that is my string:
initial hello middle '/home/wolf/Scrivania/rescue' final
----------------------

Replies are listed 'Best First'.
Re: Strange read file behaviour
by almut (Canon) on Jun 28, 2010 at 11:26 UTC

    Your file most likely has Windows line endings, i.e. \r\n.  When you read the file on Linux, the crlf PerlIO layer which translates \r\n to \n is not in effect by default, as it is on Windows (so the \r remains), and chomp only removes what is in $/, i.e. \n.

    (\r positions the terminal's cursor to the beginning of the line, so subsequent text overwrites whatever had been written there before...)

    Fix:

    open DIRECTORIES, "<:crlf", $source or die $!;
                         ^^^^
      Thank you all for your replies.
      open DIRECTORIES, "<:crlf", $source or die $!;
      solved the problem.
      I am indeed testing cross-platform code.
        Alternative:
        open DIRECTORIES, "<", $source or die $!;
        while (<DIRECTORIES>) {
            s/\s+\z//;  # Remove trailing whitespace, incl \r and \n
            ...
        }
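
A minimal sketch comparing the two fixes above with a plain chomp. The helper names `write_crlf_file` and `read_lines` are made up for illustration, and the temporary file only stands in for directories.txt. The `<:raw` mode is an assumption used here to mimic the Linux default (no crlf translation) on any platform.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a throwaway file with Windows (CRLF) line endings.
# binmode stops any platform layer from altering the bytes we write.
sub write_crlf_file {
    my ($fh, $path) = tempfile(UNLINK => 1);
    binmode $fh;
    print {$fh} "# text 1\r\n/home/wolf/Scrivania/rescue\r\n";
    close $fh or die "Cannot close $path: $!";
    return $path;
}

# Read every line through open mode $mode, tidy it with $clean,
# and return the resulting lines.
sub read_lines {
    my ($path, $mode, $clean) = @_;
    open my $in, $mode, $path or die "Cannot open $path: $!";
    my @lines;
    while (my $line = <$in>) {
        $clean->($line);    # $clean edits its argument in place via $_[0]
        push @lines, $line;
    }
    close $in;
    return @lines;
}

my $path = write_crlf_file();
my @variants = (
    [ 'raw + chomp',     '<:raw',  sub { chomp $_[0] } ],
    [ ':crlf + chomp',   '<:crlf', sub { chomp $_[0] } ],
    [ 'raw + s/\s+\z//', '<:raw',  sub { $_[0] =~ s/\s+\z// } ],
);
for my $variant (@variants) {
    my ($label, $mode, $clean) = @$variant;
    for my $line (read_lines($path, $mode, $clean)) {
        (my $shown = $line) =~ s/\r/\\r/g;   # make a leftover CR visible
        print "$label: '$shown'\n";
    }
}
```

Only the first variant should keep a trailing \r. Note that the regex variant also strips trailing spaces and tabs, which may or may not be wanted.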
Re: Strange read file behaviour
by jethro (Monsignor) on Jun 28, 2010 at 11:26 UTC
    Windows text files have '\r\n' at the end of each line, while Linux/Unix text files have only \n. So if you give a Windows text file to Perl on Linux, chomp expects and removes only the \n and leaves the \r, which returns the cursor to the beginning of the current line (not the next one). Example of the \r effect:

    > perl -e 'print "jacket \rp\n";'
    packet
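
To see why the text looks overwritten, here is a rough sketch that imitates how a terminal treats \r on a single line. `render_cr` is a hypothetical helper, and it models the carriage return only, not any other control character.

```perl
use strict;
use warnings;

# Imitate a terminal line: \r moves the cursor back to column 0,
# and every other character overwrites (or appends at) the cursor.
sub render_cr {
    my ($text) = @_;
    my $screen = '';
    my $col    = 0;
    for my $ch (split //, $text) {
        if ($ch eq "\r") {
            $col = 0;
            next;
        }
        substr($screen, $col, 1, $ch);   # replace the char at $col, or append
        $col++;
    }
    return $screen;
}

print render_cr("jacket \rp"), "\n";
```

The same mechanism is at work in the question: the text printed after the leftover \r lands on top of the start of the line.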
Re: Strange read file behaviour
by jwkrahn (Abbot) on Jun 28, 2010 at 11:24 UTC
    I don't know what exactly is happening, so please can you explain?

    Your file directories.txt was created on a Windows machine so it has <CR><LF> ("\r\n") line endings and chomp on Unix will only remove the <LF> and leave the <CR>.

      and chomp on Unix will only remove the <LF>

      This is correct, but the "on Unix" kind of implies that this would be different on Windows — which it isn't.  On Windows it also only removes a linefeed.  But that's ok, as on Windows the crlf layer has already translated the \r\n to \n while reading in the line.
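
One way to check which layers are actually in effect is the built-in `PerlIO::get_layers`. The sketch below is only illustrative: `layers_for` is a made-up helper and the temporary file is a stand-in for a real input file. The exact list differs between platforms and builds, so no particular output is assumed.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Return the PerlIO layer names on a read handle opened with $mode.
sub layers_for {
    my ($path, $mode) = @_;
    open my $fh, $mode, $path or die "Cannot open $path: $!";
    my @layers = PerlIO::get_layers($fh);
    close $fh;
    return @layers;
}

# An empty temporary file is enough, since only the layers matter here.
my ($tmp, $path) = tempfile(UNLINK => 1);
close $tmp;

print "default: @{[ layers_for($path, '<') ]}\n";
print "crlf:    @{[ layers_for($path, '<:crlf') ]}\n";
```

If the default list already contains crlf (as is typical on Windows), the \r\n translation happens without asking for it. Otherwise it has to be requested explicitly.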

Re: Strange read file behaviour
by Marshall (Canon) on Jun 28, 2010 at 11:41 UTC
    The purpose of FindBin is: Locate the full path to the script bin directory to allow the use of paths relative to the bin directory. See FindBin.

    What is the purpose of "my $ref2Test=\$test;"?

Re: Strange read file behaviour
by Anonymous Monk on Jun 28, 2010 at 10:53 UTC
    #!/usr/bin/perl --
    use strict;
    use warnings;
    use Data::Dumper;

    print Dumper( "(((\00n\00ormal)))\n" );
    {
        local $Data::Dumper::Useqq=1;
        print Dumper( "(((\00n\00ormal)))\n" );
    }
    __END__
    $ perl temp.pl
    $VAR1 = '((( n ormal)))
    ';
    $VAR1 = "(((\0n\0ormal)))\n";

    $ perl temp.pl |od -tacx1
    0000000   $   V   A   R   1  sp   =  sp   '   (   (   ( nul   n nul   o
              $   V   A   R   1       =       '   (   (   (  \0   n  \0   o
             24  56  41  52  31  20  3d  20  27  28  28  28  00  6e  00  6f
    0000020   r   m   a   l   )   )   )  cr  nl   '   ;  cr  nl   $   V   A
              r   m   a   l   )   )   )  \r  \n   '   ;  \r  \n   $   V   A
             72  6d  61  6c  29  29  29  0d  0a  27  3b  0d  0a  24  56  41
    0000040   R   1  sp   =  sp   "   (   (   (   \   0   n   \   0   o   r
              R   1       =       "   (   (   (   \   0   n   \   0   o   r
             52  31  20  3d  20  22  28  28  28  5c  30  6e  5c  30  6f  72
    0000060   m   a   l   )   )   )   \   n   "   ;  cr  nl
              m   a   l   )   )   )   \   n   "   ;  \r  \n
             6d  61  6c  29  29  29  5c  6e  22  3b  0d  0a
    0000074

    $ perl temp.pl |hexdump
    00000000: 24 56 41 52 31 20 3D 20 - 27 28 28 28 00 6E 00 6F |$VAR1 = '((( n o|
    00000010: 72 6D 61 6C 29 29 29 0D - 0A 27 3B 0D 0A 24 56 41 |rmal)))  ';  $VA|
    00000020: 52 31 20 3D 20 22 28 28 - 28 5C 30 6E 5C 30 6F 72 |R1 = "(((\0n\0or|
    00000030: 6D 61 6C 29 29 29 5C 6E - 22 3B 0D 0A             |mal)))\n";  |
    0000003c;
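
Applied to the original problem, the same Data::Dumper trick can expose the leftover \r in a line. This is a small sketch under the assumption that the line was read from a CRLF file on Linux. `visible` is a hypothetical helper, and the Terse and Indent settings are only there to keep the output short.

```perl
use strict;
use warnings;
use Data::Dumper;

# Render a string with control characters shown as escapes.
sub visible {
    my ($text) = @_;
    local $Data::Dumper::Useqq  = 1;   # double-quoted output, \r and \n escaped
    local $Data::Dumper::Terse  = 1;   # omit the "$VAR1 = " prefix
    local $Data::Dumper::Indent = 0;   # keep everything on one line
    return Dumper($text);
}

my $line = "# text 1\r\n";   # a CRLF line as read without the crlf layer
chomp $line;                 # removes only the \n
print visible($line), "\n";
```

Printing `visible($_)` inside the read loop is a quick way to confirm whether a stray \r is present before deciding on a fix.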