While this is absolutely correct, it does raise the question: what should be considered "characters" in a text file? To my mind, all characters, including carriage return (\r) and line feed (\n), should be counted, as these do contribute to the size of the file. The difference in reported size encountered by the venerable Anonymous Monk, as gjb has rightly alluded to, is due to platform differences in the interpretation of these characters: Windows text files conventionally end each line with \r\n, whereas Unix-like systems use \n alone.

An alternative method of counting the number of characters in a file, including carriage returns and line feeds, which should return the same result irrespective of platform, would be:

print length do { local $/; open my $fh, '<:raw', $file or die "Cannot open $file: $!"; <$fh> }, "\n";

Here the variable $file holds the name of the text file whose characters are to be counted.
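For anyone who prefers a longer form, here is a minimal sketch of the same idea, assuming the aim is to count every byte in the file, \r and \n included. The helper name count_chars is my own invention for illustration. It reads through the :raw layer so that no line-ending translation takes place on any platform, which means the result should agree with the size reported by -s.

```perl
use strict;
use warnings;

# Hypothetical helper: count every byte in a file, including \r and \n.
sub count_chars {
    my ($file) = @_;

    # The :raw layer disables CRLF translation, so the count does not
    # depend on the platform the script runs on.
    open my $fh, '<:raw', $file or die "Cannot open $file: $!";

    local $/;                  # slurp mode: read the whole file in one go
    my $content = <$fh>;
    close $fh;

    return defined $content ? length($content) : 0;
}
```

With this in place, print count_chars($file), "\n"; gives the count, and comparing it against -s $file is a quick sanity check.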

 

Update

With regard to the follow-up post from Anonymous Monk, I would concur with the direction suggested by gjb in this post: it sounds as if there *really is* a difference between the files being compared on the two different machines (presumably as a result of the file transfer via FTP), hence the differing character counts.
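One way to check this, sketched here under the assumption that the FTP transfer was made in ASCII mode and rewrote the line endings, is to count the line endings in each copy of the file. The helper name line_ending_counts is hypothetical. If the Windows copy is larger than the Linux copy by exactly the number of lines, the extra bytes are very likely the \r characters.

```perl
use strict;
use warnings;

# Hypothetical helper: return (total bytes, CRLF count, bare LF count) for a file.
sub line_ending_counts {
    my ($file) = @_;

    # Read in raw mode so the bytes on disk are seen unchanged.
    open my $fh, '<:raw', $file or die "Cannot open $file: $!";
    local $/;                  # slurp mode
    my $content = <$fh>;
    close $fh;
    $content = '' unless defined $content;

    my $crlf = () = $content =~ /\r\n/g;         # Windows-style line endings
    my $lf   = () = $content =~ /(?<!\r)\n/g;    # Unix-style line endings

    return (length($content), $crlf, $lf);
}
```

Running this against both copies and comparing the three numbers should show whether the files differ only in their line endings or in something else as well.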

 

perl -le 'print+unpack("N",pack("B32","00000000000000000000000111111110"))'


In reply to Re: Re: char count windows vs linux by rob_au
in thread char count windows vs linux by Anonymous Monk
