
I suspect <194> refers to a character, and that this notation is used for non-ASCII characters. If so, your last question is moot.

Re^5: substr function
by Jim (Curate) on Jan 13, 2011 at 17:11 UTC

    Huh?

    If <194> is a character entity that represents a "non-ASCII character," then that is precisely what makes my question germane: how should the length of the text containing that entity be measured?

    For Unicode text, there are at least three valid, meaningful ways to measure the size of the text: in bytes, in characters (code points), and in grapheme clusters.
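
A rough illustration of how the three measures can disagree, sketched in Python for brevity. The function name `text_sizes` and the sample strings are made up for this example, and the grapheme count is only an approximation: it treats every combining mark as part of the preceding cluster rather than applying the full Unicode segmentation rules.

```python
import unicodedata


def text_sizes(text: str) -> dict:
    """Return the size of `text` in UTF-8 bytes, code points, and (roughly) graphemes."""
    # Bytes depend on the encoding; UTF-8 is assumed here.
    byte_count = len(text.encode("utf-8"))
    # len() on a Python str counts code points.
    code_point_count = len(text)
    # Approximation: a code point with a nonzero canonical combining class
    # is counted as part of the previous grapheme cluster.
    grapheme_count = sum(1 for ch in text if unicodedata.combining(ch) == 0)
    return {
        "bytes": byte_count,
        "code_points": code_point_count,
        "graphemes": grapheme_count,
    }


# "resume" spelled with two combining acute accents (U+0301).
print(text_sizes("re\u0301sume\u0301"))
# Code point 194 on its own, assuming <194> means that code point.
print(text_sizes(chr(194)))
```

With the decomposed sample, all three numbers differ. With the single code point 194, only the byte count differs from the other two.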

      Oops, it makes bytes vs. encoded characters moot, but characters vs. graphemes is still relevant.

      By the way, there is indeed a fourth measure: some characters are double-wide, so you could also talk about visual width.
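
A minimal sketch of that fourth measure, again in Python. The helper name `display_width` is hypothetical, and the rule it uses is a simplification: East Asian Wide and Fullwidth characters count as two columns, combining marks as zero, everything else as one. Control characters, ambiguous-width characters, and emoji sequences are not handled.

```python
import unicodedata


def display_width(text: str) -> int:
    """Estimate how many terminal columns `text` occupies (simplified rule)."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            # Combining marks are drawn over the previous character.
            continue
        # "W" (Wide) and "F" (Fullwidth) characters take two columns.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


print(display_width("abc"))           # three narrow characters
print(display_width("\u65e5\u672c\u8a9e"))  # three wide CJK characters
```

Both samples contain three code points, but the second one occupies twice as many columns under this rule.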