
If you're talking about memory usage, yes.

Re^50: Interleaving bytes in a string quickly
by BrowserUk (Patriarch) on Mar 03, 2010 at 06:47 UTC

    Then doesn't it make sense to stick to the terminology that is used elsewhere (e.g. perlfunc: "Note the characters: depending on the status of the socket, either (8-bit) bytes or characters are received.")? With the logical extension that strings with UTF8=0 are 'byte strings', and strings with UTF8=1 are 'character strings'.

    At the Perl level of course. At the C-level, they are all just byte arrays until you give them to something that attempts to interpret them differently.

    Which would make this piece oxymoronology:

    If sv contains the byte string "\x80\x81", the pointer returned by SvPVX(sv) points to one of the following:
    • two bytes 80 81*
    • three bytes 80 81 00
    • four bytes C2 80 C2 81*
    • five bytes C2 80 C2 81 00

    Read as:

    If sv contains the character string "\x80\x81", the pointer returned by SvPVX(sv) points to one of the following:
    • two bytes 80 81*
    • three bytes 80 81 00
    • four bytes C2 80 C2 81*
    • five bytes C2 80 C2 81 00

    (You still haven't demonstrated how you can get those four to exist with only two string types)

    But never mind, it doesn't matter, because had you been that clear, it would have killed this thread stone dead way back at level 5, when I said I was only interested in dealing with "byte strings". Because it simply doesn't make sense to interleave (multi-byte) characters with constant bytes in order to pad them to 10-bit HDMI entities within 16-bit fields.
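    For illustration only, a minimal C sketch of interleaving the bytes of a byte string with a constant byte so that each input byte lands in its own 16-bit field. The function name `interleave_bytes`, the choice of putting the data byte first and the pad byte second, and the pad value are all assumptions; the thread does not spell out the exact field layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: write each input byte followed by a constant pad
 * byte, so every input byte occupies its own 2-byte field.
 * The byte order within the field is an assumption for illustration.
 * 'out' must have room for 2 * n bytes. */
void interleave_bytes(const unsigned char *in, size_t n,
                      unsigned char pad, unsigned char *out)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        out[2 * i]     = in[i];  /* data byte */
        out[2 * i + 1] = pad;    /* constant filler byte */
    }
}
```

    Called on the two bytes 80 81 with a pad of 00, this fills a four-byte output buffer with 80 00 81 00. It only makes sense when every input element is exactly one byte wide, which is the point being made above.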

    If you make up terminology, and ignore all my protestations that I know what I'm doing and that I'm doing what I know to be right, stupid threads like this are the result.

    I'd still like to see a good (realistic) use case for strings of variable-width packed integers, but since they don't work, it doesn't really matter.

    And this really is my last post in this thread.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      (You still haven't demonstrated how you can get those four to exist with only two string types)

      You know that +1 and the assigned \0 in your code? Don't allocate the +1 and don't write the \0, and there you go.

      The \0 is not required, but much XS code assumes it's there.
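      A standalone C sketch of that difference, not using the Perl API: the same two bytes copied into a buffer allocated with or without room for a trailing \0. The helper name `copy_bytes` and its `add_nul` parameter are hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of the Perl API: copy 'len' bytes into a
 * fresh buffer. With add_nul != 0 it allocates len + 1 bytes and appends
 * a terminating '\0'; otherwise it allocates exactly len bytes.
 * The number of bytes requested is stored in *alloc_len.
 * Returns NULL on allocation failure; the caller frees the result. */
unsigned char *copy_bytes(const unsigned char *src, size_t len,
                          int add_nul, size_t *alloc_len)
{
    assert(src != NULL && alloc_len != NULL);
    size_t need = len + (add_nul ? 1 : 0);
    unsigned char *buf = malloc(need ? need : 1);  /* avoid malloc(0) */
    if (buf == NULL)
        return NULL;
    memcpy(buf, src, len);
    if (add_nul)
        buf[len] = '\0';  /* the extra byte that much C code expects */
    *alloc_len = need;
    return buf;
}
```

      With the input 80 81, `add_nul` set gives the three bytes 80 81 00, and `add_nul` clear gives just the two bytes 80 81.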

      Read as: If sv contains the character string "\x80\x81",

      I don't understand. How is calling a byte string a character string useful? Why would you do that? It's not technically wrong, but it's definitely not clearer.

      I realize that C calls both bytes and characters char, but I don't think that's something to aspire to.

        Aaaaaaaaaaaaaaarg! Where is your head at?

        I don't understand. How is calling a byte string a character string useful?

        It's not "calling a byte string a character string". It's calling a string containing multi-byte characters a character string.

        Because the only way SvPVX() can return a pointer to more than 2 bytes of RAM at the C level is if the SV contains a scalar containing 2 (multi-byte) characters at the Perl level.

        If the scalar contains a string containing just two bytes, then SvPVX() will return a pointer to just exactly 2 bytes. It can do nothing else, as we've at length established, because SvPVX() performs no coercions.
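        For reference, a standalone C sketch (plain C, no Perl API) of the two byte counts being argued over: the same two values 0x80 and 0x81 take two bytes when stored one byte each, and four bytes when each is stored as a UTF-8 sequence. The function name `store_chars` and its `as_utf8` parameter are hypothetical stand-ins for the UTF8 flag, not anything from perl's headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: store values 0x00..0xFF either as one raw byte
 * each (as_utf8 == 0) or as UTF-8 sequences (as_utf8 != 0).
 * Values below 0x80 are one byte in both formats; values 0x80..0xFF
 * become two bytes in UTF-8. 'out' must have room for 2 * n bytes.
 * Returns the number of bytes written. */
size_t store_chars(const unsigned char *cp, size_t n,
                   int as_utf8, unsigned char *out)
{
    assert(cp != NULL && out != NULL);
    size_t w = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned char c = cp[i];
        if (!as_utf8 || c < 0x80) {
            out[w++] = c;                                   /* single byte */
        } else {
            out[w++] = (unsigned char)(0xC0 | (c >> 6));    /* lead byte */
            out[w++] = (unsigned char)(0x80 | (c & 0x3F));  /* continuation */
        }
    }
    return w;
}
```

        For the input 80 81 this writes 80 81 with `as_utf8` clear and C2 80 C2 81 with it set; no trailing \0 is written in either case.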

        (You still haven't demonstrated how you can get those four to exist with only two string types)

        You know that +1 and the assigned \0 in your code? Don't allocate the +1 and don't write the \0, and there you go.

        Now you're either just taking the piss, or you're talking gibberish.

        Because there is no +1 in my code, and the only assignment of \0 is into the output buffer. Which has nothing whatsoever to do with the input accessed via SvPVX().

        The "those four to exist" to which I refer are the four possibilities you claimed: that SvPVX() could return a pointer to, variously, 2, 3, 4 or 5 bytes of RAM.

