in reply to Re^5: Extracting Data from a second line
in thread Extracting Data from a second line

The number returned by that program is still larger than it needs to be in Windows. There are 41 characters to read in Windows, but $size_of_DATA is 44.

Re^7: Extracting Data from a second line
by kwaping (Priest) on Feb 16, 2006 at 19:37 UTC
    Must be a Windows vs. Unix issue then. The numbers work out on my system. Also, in your previous code, the numbers were all 560 on my system.

      Of course it is. In text mode in Windows, \x0D\x0A is two bytes, but a single character.

      But it's not just a Windows issue; it's also an encoding issue. If you set the encoding of DATA or of your FILE to a multi-byte encoding such as UTF-8, then you'll have a discrepancy between the number of characters and the number of bytes (even on Unix) if you have any multi-byte characters in your DATA or in your FILE.
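      A minimal sketch of that discrepancy, using the core Encode module. The sample string is made up for illustration; any string containing a character outside ASCII would show the same thing.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical sample string: "caf" followed by U+00E9 (e with acute accent).
my $text = "caf\x{E9}";

# Encode the characters into the UTF-8 bytes that would be stored in a file.
my $bytes = encode('UTF-8', $text);

my $char_count = length $text;     # length of the decoded string: characters
my $byte_count = length $bytes;    # length of the encoded string: bytes

print "characters: $char_count, bytes: $byte_count\n";
# prints "characters: 4, bytes: 5"
```

      U+00E9 occupies two bytes in UTF-8, so the file is one byte longer than the text it holds.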

      The whole point is that there is no way of knowing the number of characters in a file from its size alone, since there's no fixed relation between the size of the file in bytes and the number of characters in it.

      It is therefore useless to "optimize" the value passed as read's third parameter, which is a number of characters (it is a number of bytes only in :raw mode). Any value at least as large as the number of bytes in the file will do nicely.
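      Here is a rough sketch of that, assuming a throwaway temp file with made-up contents and an explicit :crlf layer so that it behaves the same on Unix as text mode does on Windows. The file size in bytes is passed as the length, and read simply stops at end of file and returns the number of characters it actually got.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical sample data: two lines with Windows-style CRLF endings.
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;    # write the bytes exactly as given, with no translation
print {$out} "first line\x0D\x0Asecond line\x0D\x0A";
close $out or die "close failed: $!";

my $bytes_on_disk = -s $path;

# The :crlf layer turns each \x0D\x0A pair into a single "\n" on input,
# which is what text mode does by default on Windows.
open my $in, '<:crlf', $path or die "open failed: $!";

# Ask for as many characters as there are bytes; read stops at end of file.
my $chars_read = read $in, my $buffer, $bytes_on_disk;
die "read failed: $!" unless defined $chars_read;
close $in;

print "bytes on disk: $bytes_on_disk, characters read: $chars_read\n";
```

      The length only has to be an upper bound. The return value of read, or length($buffer), tells you how many characters were really there.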

        So, what is the answer then? When reading DATA, do we continue to manually count the characters and use that? Or do we use a number that we know will be sufficient (1000000 or something similar) but is guaranteed too big? These are more rhetorical questions than anything. I would just like a way to find the size of DATA, programmatically.