in reply to Re^27: Interleaving bytes in a string quickly
in thread Interleaving bytes in a string quickly

Only the meaning assigned to that value:

Your demo actually shows that there is no change in meaning. It's %d or %u that controls what gets displayed, so %d and %u are what give the value meaning.

I will grant you that there is no change in value for some purposes. i==j is true, so the value is the same as far as == is concerned. However, (i<5) == (j<5) is false, so the value is different as far as < is concerned.
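The distinction can be checked directly in C. This is only a sketch: it assumes the demo stored the same bit pattern in a signed int (i, here -1) and an unsigned int (j, here UINT_MAX), and the two function names are made up for illustration.

```c
#include <limits.h>

/* Assumption: i and j hold the same bits, e.g. i = -1 and j = UINT_MAX. */

/* == converts i to unsigned before comparing, so identical bits compare equal. */
int eq_says_same(int i, unsigned int j)
{
    return i == j;
}

/* Each < is evaluated in its own operand type: -1 < 5 holds, UINT_MAX < 5 does not. */
int lt_says_same(int i, unsigned int j)
{
    return (i < 5) == (j < 5);
}
```

Calling eq_says_same(-1, UINT_MAX) gives 1, while lt_says_same(-1, UINT_MAX) gives 0: the same bits, but the declared type decides how < reads them.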

It's still moot, though, since 8-bit strings are a proper subset of 32/64-bit strings.

I've been pointing out right from the very beginning, that I'm not interested in getting "the encoded version"

Exactly, and SvPVX can return the encoded version. At least without the limitation you added a couple of posts ago.

that can never be safely or logically treated as any form of unicode.

I don't know why you keep bringing up unicode.

Re^29: Interleaving bytes in a string quickly
by BrowserUk (Patriarch) on Mar 01, 2010 at 17:23 UTC
    and SvPVX can return the encoded version.

    Gaaaaaaaaah! No it can't. It just returns a pointer to some memory. It places no interpretation upon what is in that memory. And neither does my code.

    I don't know why you keep bringing up unicode.

    Look again. Prior to the quoted post, I mentioned it once, and you mentioned it once.

    That aside, isn't utf-8 a "form of unicode."?


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      It places no interpretation upon what is in that memory.

      I know! You've said this a dozen times already. And what format is that pointed-to memory in? You only guaranteed the format in a node 20 deep or so.
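To make the format question concrete, here is a small sketch in plain C (not XS, and not Perl's API). The function name and the `encoded` parameter are hypothetical; `encoded` stands in for whatever out-of-band knowledge the caller has about the buffer.

```c
#include <stddef.h>

/* Count the logical elements in a buffer.
   encoded == 0: every byte is one element.
   encoded != 0: the bytes are assumed to be UTF-8 style, so continuation
   bytes (10xxxxxx) belong to the element started by an earlier lead byte. */
size_t count_elements(const unsigned char *p, size_t n, int encoded)
{
    size_t i, count = 0;
    for (i = 0; i < n; i++) {
        if (!encoded || (p[i] & 0xC0) != 0x80)
            count++;
    }
    return count;
}
```

For the two bytes 0xC3 0xA9, the count is 2 with encoded == 0 and 1 with encoded != 0. The pointer is the same in both cases; only the caller's assumption about the format changes the answer.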

      Look again.

      Ah yes, you only said "codepoint", not "unicode". That usually means "unicode codepoints", but you didn't imply any character semantics.

      That aside, isn't utf-8 a "form of unicode."?

      Unicode is a character set. You're clearly not dealing with characters.

      UTF-8 is a storage format. Typically, it's used to encode unicode characters, but Perl uses it internally to encode 32-bit or 64-bit integers (depending on your build). Those integers may be codepoints, but that applies to UTF8=0 strings too.
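As a rough illustration of UTF-8 as a pure storage format, the sketch below encodes an arbitrary integer of up to 31 bits with the generalized UTF-8 bit layout (the original 1 to 6 byte scheme). It is not Perl's source, and the function name is invented for this example; the point is only that nothing in the byte layout requires the integer to be a Unicode codepoint.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode v (at most 31 bits) into out using the generalized UTF-8 layout.
   Returns the number of bytes written (1 to 6), or 0 if v does not fit. */
size_t encode_utf8_like(uint32_t v, unsigned char out[6])
{
    size_t len, k;

    if (v < 0x80) {                 /* 0xxxxxxx: stored as a single byte */
        out[0] = (unsigned char)v;
        return 1;
    }
    if      (v < 0x800)       len = 2;
    else if (v < 0x10000)     len = 3;
    else if (v < 0x200000)    len = 4;
    else if (v < 0x4000000)   len = 5;
    else if (v < 0x80000000u) len = 6;
    else return 0;

    /* Continuation bytes, filled from the end: 10xxxxxx, six payload bits each. */
    for (k = len - 1; k > 0; k--) {
        out[k] = (unsigned char)(0x80 | (v & 0x3F));
        v >>= 6;
    }
    /* Lead byte: len one-bits, a zero bit, then the remaining payload bits. */
    out[0] = (unsigned char)((0xFF << (8 - len)) | v);
    return len;
}
```

With this layout, 0xE9 becomes the two bytes C3 A9, and 0x110000, which lies above the Unicode range, still encodes cleanly as F4 90 80 80.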

        You only guaranteed the format in a node 20 deep

        No. I guaranteed that a) in the title of the thread; b) when I wrote the code.

        UTF-8 is a storage format. Typically, it's used to encode unicode characters, but Perl uses it internally to encode 32-bit integers (or 64-bit on a 64-bit build, I think).

        Please demonstrate. Cos if that is true, it is something that has completely eluded me.

