PiEquals3 has asked for the wisdom of the Perl Monks concerning the following question:

I wrote a script to change non-printable characters to spaces:
(The pattern is intended to represent, from left to right:
- the ASCII range containing letters, digits, and punctuation (0x20-0x7E),
- newlines (Windows: CR-LF),
- form feed (new page),
- horizontal tab
)
$ifn = $ARGV[0];
$ofn = $ARGV[1];
open IFN, "$ifn" or die "Can't open $ifn:$!\n";
open OFN, ">$ofn" or die "Can't open $ofn:$!\n";
while (<IFN>) {
    $_ =~ s/[^\x20-\x7e\x0d\x0a\x0c\x09]/ /g;
    print OFN;
}

A problem arose when 0x1A occurred in the file: the output file ended on the preceding character!

The name of this character is "substitute", but the control character is ^Z (the EOF marker, I believe. Forgive me, I have to use Windows NT.) My suspicion is that, upon reading this character, <> assumes EOF and leaves the loop.

Is there a way to read past this character safely? What could I do?

Replies are listed 'Best First'.
Re: Reading past an artificial EOF?
by chipmunk (Parson) on Jan 13, 2001 at 00:09 UTC
    Yes, you've deduced the problem exactly; ^Z is the EOF character on Windows. To avoid this problem, call binmode() on your filehandle before you read from it.
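    As a rough sketch of that suggestion (not necessarily how the original script was fixed), the loop could be rewritten along these lines. The helper name clean_file and the lexical filehandles are illustrative assumptions; binmode() itself is a Perl built-in.

```perl
use strict;
use warnings;

# Hypothetical helper (name is illustrative): copy $ifn to $ofn,
# replacing non-printable bytes with spaces.
sub clean_file {
    my ($ifn, $ofn) = @_;

    open my $in,  '<', $ifn or die "Can't open $ifn: $!\n";
    open my $out, '>', $ofn or die "Can't open $ofn: $!\n";

    # Switch both handles to binary mode. On Windows this keeps the
    # text-mode layer from treating 0x1A (^Z) as end of file and from
    # translating CR-LF pairs; on Unix-like systems it changes nothing here.
    binmode $in;
    binmode $out;

    while (my $line = <$in>) {
        # Same character class as in the question.
        $line =~ s/[^\x20-\x7e\x0d\x0a\x0c\x09]/ /g;
        print {$out} $line;
    }

    close $in;
    close $out or die "Can't close $ofn: $!\n";
    return;
}

# Run as a script: perl clean.pl infile outfile
clean_file(@ARGV) if @ARGV == 2;
```

    Calling binmode on the output handle as well is a design choice: since the input is no longer translated, the CR-LF pairs arrive intact and should be written back unchanged.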
      That worked marvelously!

      The sterling reputation of the Perl community is re-affirmed yet again.

      Alas, I have much yet to learn..

      --
      Can an atheist be insured against acts of God?

Re: Reading past an artificial EOF?
by Chady (Priest) on Jan 13, 2001 at 00:15 UTC
    Because you are using Windows, you need to call binmode() on the filehandle you are reading from...

    Chady | http://chady.net/
Re: Reading past an artificial EOF?
by PiEquals3 (Acolyte) on Jan 13, 2001 at 00:30 UTC
    By the way, does anyone know why that character is called "Substitute"? More off the point, whence deriveth all those low-end ASCII character names?

    Anybody maybe have a quick link?

    --
    Can an atheist be insured against acts of God?