in reply to Re^42: Interleaving bytes in a string quickly
in thread Interleaving bytes in a string quickly

I'm not sure what you're expecting from printing non-characters. Switching to Dump shows the right output is returned:
SV = PV(0x98d16d0) at 0x98d4760
  REFCNT = 1
  FLAGS = (TEMP,POK,pPOK,UTF8)
  PV = 0x99243c0 "\320\200\340\240\200\341\200\200\342\200\200\344\200\200\350\200\200\360\220\200\200\360\240\200\200\361\200\200\200\362\200\200\200"\0 [UTF8 "\x{400}\x{800}\x{1000}\x{2000}\x{4000}\x{8000}\x{10000}\x{20000}\x{40000}\x{80000}"]
  CUR = 33
  LEN = 36

Note that you need the latest version. Version 1.0 only supported strings of bytes in the 8-bit string format.
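For anyone who wants to check the numbers in that Dump, here is a minimal sketch. It builds the ten characters directly with chr() rather than calling the module under discussion, so $s is only a stand-in for the returned value. The point is that CUR counts bytes of internal storage while length() counts characters.

```perl
use strict;
use warnings;
use Devel::Peek qw(Dump);

# Stand-in for the returned string: the ten code points shown in the Dump.
my @code_points = (0x400, 0x800, 0x1000, 0x2000, 0x4000,
                   0x8000, 0x10000, 0x20000, 0x40000, 0x80000);
my $s = join '', map { chr($_) } @code_points;

# length() counts characters.
my $chars = length $s;

# Encoding a copy to UTF-8 gives the byte count that Dump reports as CUR.
my $copy = $s;
utf8::encode($copy);
my $bytes = length $copy;

printf "%d characters, %d bytes\n", $chars, $bytes;   # 10 characters, 33 bytes

# Dump writes to STDERR; the addresses will differ from the ones quoted above.
Dump($s);
```

One 2-byte character, five 3-byte characters and four 4-byte characters account for the 33.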

Re^44: Interleaving bytes in a string quickly
by BrowserUk (Patriarch) on Mar 03, 2010 at 00:58 UTC

    I'm expecting the offset in the second string to be 10, not 13!

      2^10 = 0x400, so you got the right char.
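      That can be confirmed from the Dump itself. A small sketch, assuming only the first two bytes shown in the PV ("\320\200"), decodes them and compares the code point with 2**10:

```perl
use strict;
use warnings;

# First two bytes of the PV in the Dump, written as octal escapes.
my $char = "\320\200";

# Decode the UTF-8 bytes in place; afterwards $char holds one character.
utf8::decode($char) or die "not valid UTF-8";

printf "U+%04X\n", ord $char;                          # U+0400
print "matches 2**10\n" if ord($char) == 2**10;
```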

        It's still broken if the offsets are wrong.

Re^44: Interleaving bytes in a string quickly
by BrowserUk (Patriarch) on Mar 03, 2010 at 01:47 UTC
    8-bit string format

    BTW: I'm not sure where you got it from, or if it is just a language thing, but that is a nonsensical term. An "8-bit string" would be 1 byte long.

    As is "the 32/64-bit string format". 4 or 8 bytes respectively.

    An '8-bit character string format' maybe. More usually known simply as "a byte string".

    And "32-bit/64-bit character string", though that's still not right because the characters can be "up to nn-bits". But, of course, you can't have a character with a non-power of 8 bits.

    So, "variable length character string", but that sounds like the string is variable length rather than the characters. Which I guess is why they are usually referred to as "Unicode strings" or "Wide character strings". But neither of those is quite right for these peculiar, useless beasties.

    So, how about "Variable width character strings"? A quick Google search shows a few others have hit upon that.



      More usually known simply as "a byte string".

      A "byte string" means the same as a "string of bytes" to me (flour box = box of flour), and that's not what I meant by "8-bit string format". I was referring to one of Perl's string format, not what the value of the string.

        Then you'll have to clarify what you do mean, rather than telling us what you don't.

        Because that phrase means nothing to me; nor does Google turn up anything apparently related (other than this thread).

        I was referring to one of Perl's string format, not what the value of the string.

        And I'm being neither rude nor pedantic when I say, that sentence doesn't parse to anything that I can make sense of either.
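        One possible reading of the disputed phrase, offered as an assumption rather than as what the poster confirmed: Perl can store the same string value in two internal formats, one byte per character (UTF8 flag off) or variable-width UTF-8 (UTF8 flag on). A minimal sketch using only the built-in utf8:: functions:

```perl
use strict;
use warnings;

# One character, code point 0xE9, stored one byte per character (flag off).
my $downgraded = "\xE9";

# The same value, converted to the variable-width internal format (flag on).
my $upgraded = $downgraded;
utf8::upgrade($upgraded);

print "same value\n" if $downgraded eq $upgraded;
printf "UTF8 flag: %d vs %d\n",
    (utf8::is_utf8($downgraded) ? 1 : 0),
    (utf8::is_utf8($upgraded)   ? 1 : 0);              # 0 vs 1
printf "length:    %d vs %d\n",
    length($downgraded), length($upgraded);            # 1 vs 1
```

        The two scalars compare equal and have the same length; only the storage differs, which is what the FLAGS line of Dump reports.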