in reply to Re: Deleting EOF from a file
in thread Deleting EOF from a file
IIRC, the "^Z" character (0x1A in hex, \032 in octal, 26 in decimal) has long been treated by MS systems as the byte value that marks the end of a text file, a convention inherited from CP/M and DOS, and text-mode i/o on those systems still honors it. That is, if a file is opened and read in text mode on an MS system, there will be an EOF condition as soon as a ^Z byte is encountered, even if more data follows it.
Of course, if the file is opened and read in binary mode, then ^Z has no special meaning, and is treated the same as every other possible byte value. This is important, since many non-text files (containing image, audio, compressed, compiled executable or similar kinds of data) tend to contain bytes whose value happens to be 26 (i.e. 0x1A, \032, ^Z), and using text mode on such files will cause a premature EOF condition -- not good. (There are other evils that arise when treating non-text files with MS text-mode i/o, but I shouldn't digress...)
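To illustrate the binary-mode point, here is a minimal sketch that counts how many 0x1A bytes a file contains. The helper name count_ctrl_z is made up for this example; the only thing that matters is the binmode call, which makes ^Z arrive as ordinary data rather than being taken as an end-of-file marker.

```perl
use strict;
use warnings;

# Hypothetical helper (not from any module): count the bytes with
# value 0x1A (^Z) in a file, reading it in binary mode.
sub count_ctrl_z {
    my ($path) = @_;

    open my $fh, '<', $path or die "Cannot read $path: $!";
    binmode $fh;    # raw bytes: ^Z is just data, CRLF is left alone

    my $count = 0;
    # Read in 64 KB chunks; read() returns 0 at the real end of file.
    while (read($fh, my $buf, 65536)) {
        $count += ($buf =~ tr/\x1A//);    # tr with an empty replacement only counts
    }
    close $fh;

    return $count;
}
```

Running this on an image or executable will often report a nonzero count, which is exactly why such files should not be read in text mode on an MS system.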
As for removing ^Z from a file, well... Obviously, if you do this globally on a non-text file, this is simply a form of data corruption -- whatever the original data may have been, it will be garbage after all the ^Z's are removed.
If, on an MS system, you want to do this on a real DOS/Windows text file (where there is just one ^Z, at the very end), I believe you would have to open both the input and output files in "binary mode". If you read such a file in text mode (like you're "supposed to"), the program never sees the ^Z: the text-mode i/o layer (the C runtime library, strictly speaking, rather than the OS itself) treats it as end-of-file on reading, so it never reaches your code as data, and it is mostly older DOS-era tools that append one on writing. You can only read and write ^Z explicitly in your program when handling files in binary mode. (That's the main and traditional use of perl's "binmode" function, though as of Perl 5.8 this function covers other things as well, like setting i/o layers for character encoding.)
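Something along these lines should do it; this is only a sketch, assuming the file is small enough to slurp into memory and that the only ^Z you want gone is a single one at the very end. The sub name strip_trailing_ctrl_z and the file names in the usage line are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: copy $in to $out, dropping one trailing ^Z (0x1A)
# if there is one. Returns 1 if a ^Z was removed, 0 otherwise.
sub strip_trailing_ctrl_z {
    my ($in, $out) = @_;

    open my $ifh, '<', $in or die "Cannot read $in: $!";
    binmode $ifh;                           # so the ^Z shows up as data
    my $data = do { local $/; <$ifh> };     # slurp the whole file
    close $ifh;
    $data = '' unless defined $data;

    # \z anchors at the true end of the string, so only a final ^Z is
    # touched; any 0x1A bytes in the middle are left as they are.
    my $removed = ($data =~ s/\x1A\z//) ? 1 : 0;

    open my $ofh, '>', $out or die "Cannot write $out: $!";
    binmode $ofh;                           # write the bytes back unchanged
    print {$ofh} $data;
    close $ofh or die "Cannot close $out: $!";

    return $removed;
}

# Usage (file names are placeholders):
# strip_trailing_ctrl_z('old.txt', 'new.txt');
```

Because both handles are in binary mode, the CRLF line endings pass through untouched; the only byte that can change is the last one.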
Re^3: Deleting EOF from a file
by wfsp (Abbot) on Jul 16, 2004 at 06:32 UTC | |
by graff (Chancellor) on Jul 17, 2004 at 15:18 UTC | |
by wfsp (Abbot) on Jul 17, 2004 at 16:06 UTC |