Rather than using binmode on text files, you should instead learn that "file size" only equals "number of bytes you get when the file is read into memory" when the file is a simple stream of bytes. Although that is very common on Unix, it is fairly uncommon outside of Unix.
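To see the mismatch without leaving Unix, here is a rough sketch that fakes a "text mode" read with PerlIO's :crlf layer. The temp file and variable names are just for illustration, and the layer only approximates what a non-Unix system does with its native text files.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write six bytes with no translation: two lines, each ending in CR LF.
my ($fh, $path) = tempfile(UNLINK => 1);
binmode $fh;
print {$fh} "a\r\nb\r\n";
close $fh or die "close: $!";

my $size_on_disk = -s $path;    # what the file system reports

# Read it back through the :crlf layer, which turns each CR LF into "\n".
open my $in, '<:crlf', $path or die "open: $!";
my $text = do { local $/; <$in> };
close $in;

my $length_read = length $text;
printf "-s says %d, read %d\n", $size_on_disk, $length_read;
```

The two numbers differ by one for every line in the file, and nothing is wrong with either the file or the read.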
For another example where this isn't true, consider Unix directories. They are files but they don't simply contain a stream of bytes and the "size" won't match the number of bytes you get back when you "read" them (some Unix systems will let you read a directory as a stream of bytes, but that isn't what you are supposed to do with them).
A great many types of systems don't routinely store files as simple streams of bytes (and even some that support that won't report file size to match your expectations).
It is quite common to have files recorded as a series of records. And record separators can have a length of 0 (for fixed-length records, for example) or a longer length (such as preceding the record by the length of the record) or even a variable length (such as when records are indexed). Now, Unix takes a minimalist approach (which I think turned out to be a really good idea) and implements any of the above schemes on top of the file system's idea that all files are simply a stream of bytes. So when you read an ordinary file on Unix, you just get that same stream of bytes.
But these other systems track record boundaries "outside" of the data of the file (which allows you to put a "\n" inside your record, which probably doesn't seem like a big deal to you since you've spent your entire computing lifespan thinking about files as streams of bytes). This file metadata may or may not be included in the "size" that -s gives back to you. Whether it is or not is really a matter of convenience/efficiency.
Even on Unix, non-ordinary files don't store simple streams of bytes.
And on Unix, even an ordinary file isn't actually stored as a stream of bytes. It is probably stored as a bunch of sectors thrown willy-nilly about the disk. But the Unix file system presents these to the program/programmer as a stream of bytes. So even when a Unix file has a chunk missing from the middle that is not recorded to disk (a "hole" in a sparse file), Unix zero-fills the hole when the file is read and also shows the "file size" as the number of bytes that you'd have after this has been done, so your comparison still succeeds in this case.
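As a minimal sketch of one such scheme layered on a plain byte stream, here each record is preceded by its length. The helper names pack_records and unpack_records and the 4-byte big-endian prefix are my own choices for illustration, not any particular system's format.

```perl
use strict;
use warnings;

# Hypothetical helpers: each record is stored as a 4-byte big-endian
# length followed by that many bytes of payload.
sub pack_records {
    my @records = @_;
    return join '', map { pack('N/a*', $_) } @records;
}

sub unpack_records {
    my ($stream) = @_;
    my @records;
    my $pos = 0;
    while ($pos < length $stream) {
        my $len = unpack 'N', substr($stream, $pos, 4);
        push @records, substr($stream, $pos + 4, $len);
        $pos += 4 + $len;
    }
    return @records;
}

# A record can contain "\n" (or be empty) because nothing in the
# payload is ever treated as a separator.
my $stream  = pack_records("first", "second\nwith a newline", "");
my @records = unpack_records($stream);
```

Note that the stream is longer than the payload it carries: the length prefixes are part of the byte count even though they are not part of any record.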
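If you want to watch the zero-filling happen, something like the following should do, assuming a Unix-like system. Whether the skipped range really ends up as a hole depends on the file system, so the sketch only looks at what -s and the read report.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($fh, $path) = tempfile(UNLINK => 1);
binmode $fh;

# Seek 1 MiB past the start without writing anything, then write 3 bytes.
# On file systems that support holes, the skipped range takes no disk space.
my $hole = 1024 * 1024;
seek $fh, $hole, 0 or die "seek: $!";
print {$fh} "end";
close $fh or die "close: $!";

my $reported = -s $path;

open my $in, '<:raw', $path or die "open: $!";
my $data = do { local $/; <$in> };
close $in;

my $bytes_read = length $data;
my $all_zero   = substr($data, 0, $hole) !~ /[^\0]/;

# Blocks actually allocated (512-byte units on most Unix systems;
# the field is empty on platforms that don't provide it).
my $blocks = (stat $path)[12];
```

Comparing $blocks against $reported is one way to check whether the file system kept the hole, but that part is not portable either.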
So please, just stop comparing "number of bytes read" to what -s says. It isn't portable. Even if you use binmode, you'll run into (somewhat rare) cases where this doesn't work. Even when you have an ordinary file on Unix, there are race conditions to consider.
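What I'd do instead is read until end-of-file and check for read errors, never consulting -s at all. A minimal sketch, where read_all is a hypothetical helper and the 64 KiB chunk size is arbitrary:

```perl
use strict;
use warnings;

# Hypothetical helper: slurp a file by reading until end-of-file,
# without consulting -s at all.
sub read_all {
    my ($path) = @_;
    open my $fh, '<', $path or die "Can't open $path: $!";
    my ($data, $total) = ('', 0);
    while (1) {
        # Append the next chunk at offset $total in $data.
        my $got = read $fh, $data, 65536, $total;
        die "Read error on $path: $!" unless defined $got;
        last if $got == 0;    # end of file
        $total += $got;
    }
    close $fh;
    return $data;
}
```

read returns undef on error and 0 at end-of-file, so those two checks tell you whether you got the whole file, no matter what the file system claims its size is or whether the file changed after you looked.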
binmode on text files is usually a bad idea. Comparing -s to number of bytes read is always a bad idea in my book.
- tye

In reply to Re: Remember to binmode text files (wrong test/conclusion) by tye, in thread Remember to binmode text files by diotalevi