Actually, newer versions of perl have broken the 'proper' behaviour of text mode on Win32, relative to the system utilities anyway. Historically, ^Z marked the end of a text file whose content did not fill its last allocation block on disc, on filesystems whose directory structures had no room to record the exact file length. And whilst the reason for using ^Z has fallen into history, the system utilities still honour this mechanism(*).

If you create a text file that contains a ^Z part way through:

C:\test>perl -wle" print 'the quick brown fox' for 1..3; print chr( 26 ); print 'secret message' " > junk.txt

And then use a system utility to display it:

C:\test>type junk.txt
the quick brown fox
the quick brown fox
the quick brown fox

The system utilities stop reading when they encounter the ^Z, just like in days of yore.

Perl used to honour this in 5.6.1 days, but has since broken the behaviour:

C:\test>perl -pe1 junk.txt
the quick brown fox
the quick brown fox
the quick brown fox
→
secret message

(The translation of the ^Z to a unicode entity is the result of posting, of course.)

Perl shouldn't read past a ^Z unless the file has been binmode'd. And yes, a consequence of this is that unicode files should be read in binary mode.
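If a script needs the old behaviour today, one option is to stop at the ^Z yourself. What follows is only a minimal sketch of that idea, not what perl 5.6.1 did internally: the helper name read_text_until_ctrl_z is made up for illustration, and an in-memory filehandle stands in for junk.txt so that it runs anywhere.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Perl or any module): return the lines of
# a text file up to, but not including, the first ^Z (chr 26).
sub read_text_until_ctrl_z {
    my ($fh) = @_;
    my @lines;
    while ( defined( my $line = <$fh> ) ) {
        my $pos = index $line, "\x1A";
        if ( $pos >= 0 ) {
            # Keep whatever preceded the ^Z on this line, then stop reading.
            my $head = substr $line, 0, $pos;
            push @lines, $head if length $head;
            last;
        }
        push @lines, $line;
    }
    return @lines;
}

# Demonstrate on an in-memory file with the same layout as junk.txt.
my $data = ( "the quick brown fox\n" x 3 ) . "\x1A\n" . "secret message\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
print read_text_until_ctrl_z($fh);
close $fh;
```

Run as is, this prints the three 'the quick brown fox' lines and never reaches 'secret message', which matches what type shows above.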

I don't suppose that a 'bug' report for this would rate much attention. If the world has survived the 5 (more?) years of 5.8.x without noticing this, then I don't expect much action to be taken now. But can you imagine the storm if Perl for *nix suddenly started doing line-end translation unless you used binmode :)

(*) As a historical aside, when Mike Cowlishaw ported REXX to OS/2, he used this feature to good effect. The first time a REXX program was run, the 'compiled' binary bytecode was attached to the end of the file after the ^Z. On subsequent runs, if the bytecode was still there, it would be used rather than recompiling. As all the editors and system utilities used the MS CRT to read the files, whenever a file was copied or edited, the bytecode was transparently discarded and so it would be re-compiled whenever it was modified.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
"Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."

In reply to Re^5: ActiveState woes : Is it EOF-blind? by BrowserUk
in thread ActiveState woes : Is it EOF-blind? by rovf
