in reply to Re: different length of a line from linux and windows textfile?
in thread different length of a line from linux and windows textfile?

A complaint about different line length from Windows files is likely due to "\r", which chomp will do nothing to.

s/\s+$//

Will trim all trailing whitespace, which is what I recommend over chomp. It catches "\n" and "\r" as well as spaces and tabs (which should never be allowed to have significance at the end of a line where they are extra invisible).
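As a rough illustration of the difference, here is a minimal sketch. The sample string is made up for the example and stands in for a line read from a Windows-style file on Linux.

```perl
use strict;
use warnings;

# Hypothetical sample: two trailing spaces followed by a CRLF line ending.
my $raw = "line one  \r\n";

my $chomped = $raw;
chomp $chomped;            # with the default $/, only the final "\n" goes

my $trimmed = $raw;
$trimmed =~ s/\s+$//;      # spaces, "\r" and "\n" at the end all go

printf "chomp:      %d chars\n", length $chomped;   # 11, the "\r" is still there
printf "regex trim: %d chars\n", length $trimmed;   # 8
```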

- tye        

Re^3: different length of a line from linux and windows textfile? ("\r")
by SimonPratt (Friar) on Mar 17, 2014 at 16:48 UTC

    You are right that chomp won't pick up \r on its own. However, it absolutely will pick up \r\n and correctly remove both characters, so the only time this would be an issue is if your file is corrupted or specifically crafted to utilise \r in some way.

      chomp removes \r\n only on MSWin. On Linux, it only removes \n:
      ~$ perl -E '$x = "|\r\n";print $x;chomp $x; print $x' | xxd
      0000000: 7c0d 0a7c 0d  |..|.
      لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

        No, you are both wrong. chomp just removes a trailing "\n" (by default). On every platform (I'm ignoring how ancient MacOS misdefined "\r", of course).

        Running Perl on Windows will likely cause a "\r\n" in a file to end up as just "\n" in your Perl string. In that case, the fact that chomp does nothing to "\r" doesn't cause a problem. But it is not chomp that is getting rid of the "\r" for you.
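
        A minimal sketch of that effect, assuming the :crlf I/O layer stands in for what Perl on Windows applies by default. The temporary file and its contents are made up for the example, and requesting the layers explicitly lets the same script behave the same way on either platform.

        ```perl
        use strict;
        use warnings;
        use File::Temp qw(tempfile);

        # Write one CRLF-terminated line byte for byte (hypothetical sample data).
        my ($out, $path) = tempfile(UNLINK => 1);
        binmode $out;                        # no newline translation on write
        print {$out} "line one\r\n";
        close $out or die "close: $!";

        # Read it back untranslated, as Perl on Linux does by default.
        open my $raw, '<:raw', $path or die "open: $!";
        my $raw_line = <$raw>;
        close $raw;

        # Read it back through :crlf, which turns "\r\n" into "\n" on input.
        open my $text, '<:crlf', $path or die "open: $!";
        my $text_line = <$text>;
        close $text;

        chomp $raw_line;     # leaves "line one\r"
        chomp $text_line;    # leaves "line one": the layer removed the "\r", not chomp

        printf "raw:   %d chars\n", length $raw_line;
        printf ":crlf: %d chars\n", length $text_line;
        ```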

        You can actually make chomp get rid of "\r\n" by setting $/ = "\r\n", but that also makes chomp not get rid of "\n" (unless it is immediately preceded by "\r"). So that's a worse idea.
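
        A short sketch of that trade-off, with made-up sample strings. $/ is localised so the change does not leak into the rest of a program.

        ```perl
        use strict;
        use warnings;

        my $crlf = "line one\r\n";
        my $lf   = "line one\n";

        {
            local $/ = "\r\n";   # chomp now strips exactly this string
            chomp $crlf;         # "\r\n" removed
            chomp $lf;           # untouched: it does not end in "\r\n"
        }

        printf "crlf: %d chars\n", length $crlf;   # 8
        printf "lf:   %d chars\n", length $lf;     # 9, the "\n" is still there
        ```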

        s/\s+$// is a much better idea than chomp.

        - tye        

        On Windows chomp will remove both "\n" and "\r\n".

        D:\test>od -c win32.txt
        0000000   l   i   n   e       o   n   e  \r  \n   l   i   n   e       t
        0000020   w   o
        0000022

        D:\test>od -c nix.txt
        0000000   l   i   n   e       o   n   e  \n   l   i   n   e       t   w
        0000020   o  \n
        0000022

        D:\test>perl -pi.bak -e "chomp $_;" win32.txt

        D:\test>perl -pi.bak -e "chomp $_;" nix.txt

        D:\test>od -c win32.txt
        0000000   l   i   n   e       o   n   e   l   i   n   e       t   w   o
        0000020

        D:\test>od -c nix.txt
        0000000   l   i   n   e       o   n   e   l   i   n   e       t   w   o
        0000020