My understanding has always been that the substring of a string should share its utf8'ness, and that once turned on it stays on until explicitly turned off.

It makes sense to me that once the flag is turned on in a particular string, it should stay on (e.g., if the string is modified in place via $str =~ s///). But functions that give you substrings (e.g., substr, split) create new strings to hold them, so there is no "staying on" to be done. The flag value would have to be intentionally propagated from one string to another.

Even a basic assignment creates a new string, but one would hope one of the properties of an assignment operator is that it duplicates both a variable's data and its metadata. (Yet even this fairly straightforward fact is not documented in perlop.)
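For anyone who wants to see what their own perl does in these cases, here is a minimal sketch that reports the flag for a string, an assigned copy, a substr, and the pieces from split. The helper name flag_report and the sample string are mine, purely for illustration. I deliberately make no claim about what it prints for the ASCII-only pieces, since that is exactly the part the docs don't pin down.

```perl
use strict;
use warnings;

# Hypothetical helper: report the UTF8 flag of a string, of an assigned
# copy, and of new strings derived from it via substr and split.
sub flag_report {
    my ($str) = @_;
    my $copy   = $str;                 # plain assignment
    my $sub    = substr($str, 0, 3);   # first three characters
    my @pieces = split /,/, $str;      # split on commas
    return (
        original => utf8::is_utf8($str)  ? 1 : 0,
        copy     => utf8::is_utf8($copy) ? 1 : 0,
        substr   => utf8::is_utf8($sub)  ? 1 : 0,
        split    => [ map { utf8::is_utf8($_) ? 1 : 0 } @pieces ],
    );
}

# "\x{263A}" is above 0xFF, so the source string is stored with the flag on.
my %r = flag_report("abc,\x{263A},def");
printf "original=%d copy=%d substr=%d split=%s\n",
    @r{qw(original copy substr)}, join('', @{ $r{split} });
```

The only results I'd bet on in advance are the ones for strings that actually contain the wide character, because those can't be stored any other way.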

Anything else I consider a bug.

By that logic, the behavior hv points out in Re^7: Seeking Perl docs about how UTF8 flag propagates is a bug. But since no documentation supports your expectation, I'm not sure you could make a case for that.

Having said that, it is good form to treat it as an uncertain value

Yeah, that's what I'm doing now, in response to this thread. Thanks to everyone who's chimed in.
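In practice, treating the flag as uncertain can look something like the sketch below: encode explicitly at the output boundary and never branch on utf8::is_utf8. The helper name to_utf8_bytes is hypothetical, and this is just one way to do it, not a recommendation from the docs.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: produce UTF-8 bytes for output without ever
# consulting the internal UTF8 flag of the input.
sub to_utf8_bytes {
    my ($chars) = @_;
    return encode('UTF-8', $chars);
}

my $native   = "caf\xE9";    # four characters, stored without the flag
my $upgraded = $native;
utf8::upgrade($upgraded);    # same four characters, internal flag now on

# The flag differs between the two, but the strings compare equal and
# encode to the same bytes, so code written this way doesn't care.
print "same characters\n" if $native eq $upgraded;
print "same bytes out\n"  if to_utf8_bytes($native) eq to_utf8_bytes($upgraded);
```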

In reply to Re^2: Seeking Perl docs about how UTF8 flag propagates by raygun
in thread Seeking Perl docs about how UTF8 flag propagates by raygun
