in reply to [NOT] How would you decode this?

Really fast => C

If it has to be pure Perl, it might make sense to decode the string in all possible variations, i.e.:

@singlebyte = vec $string, 0, 8;
@twobyte0 = vec $string, 0, 16;
@twobyte1 = vec $string, 8, 16;
...
@threebyte2 = vec $string, 16, 24;

and use appropriate offsets into these arrays (or pad the arrays so that you can use one offset for all).

If there is some bias in the input so that most values are coded with one of the three methods (for example, if 98% of all integers were encoded with 3 bytes), even better. You might get away with only one, two, or three of these vec calls and decode the other variants slowly without hurting overall performance.

There is also a final AND-ing necessary to eliminate the two high bits in the two-byte and three-byte cases, but I'm sure you already thought about this.
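A minimal sketch of that masking step. It assumes the values are big-endian and that the top two bits of the first byte are a tag to be stripped; the sample bytes and variable names are made up for illustration and are not taken from the original question.

```perl
use strict;
use warnings;

# Assumption: big-endian values whose two highest bits are a tag, not data.
my $two   = unpack 'n', "\x41\x02";            # raw 16-bit value 0x4102
my $three = unpack 'N', "\0" . "\x80\x01\x00"; # pad 3 bytes to 4, raw 0x800100

my $value2 = $two   & 0x3FFF;    # keep the low 14 bits
my $value3 = $three & 0x3FFFFF;  # keep the low 22 bits

printf "%d %d\n", $value2, $value3;
```

The masks are simply 16 - 2 = 14 and 24 - 2 = 22 one-bits; if the real format puts the tag elsewhere, the masks change accordingly.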

UPDATE: Not only is vec not possible with 24 bits, it also extracts only one value per call instead of handling the whole string. I should learn to read again. Without that, the time savings are probably minimal to nonexistent. So something like unpack "n*" and unpack "cn*" for the 16-bit values would be more appropriate. That still doesn't solve the 24-bit case.
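A rough sketch of the unpack idea, assuming big-endian data; the sample string is invented. It uses "x n*" (skip one byte) where the text above says "cn*" (read one byte as a signed char, then the 16-bit values), which gives the same 16-bit list without the extra leading element. For the 24-bit case, one possible workaround (not necessarily a fast one) is to cut the string into 3-byte chunks and pad each to 4 bytes.

```perl
use strict;
use warnings;

my $string = "\x00\x01\x00\x02\x00\x03";   # made-up sample data

# 16-bit big-endian values starting at even byte offsets (0, 2, 4, ...)
my @twobyte0 = unpack 'n*', $string;

# The same, starting at odd byte offsets (1, 3, ...): 'x' skips one byte
my @twobyte1 = unpack 'x n*', $string;

# unpack has no 24-bit format, so split into 3-byte chunks, pad each
# with a leading zero byte and read it as a 32-bit big-endian value
my @threebyte0 = map { unpack 'N', "\0" . $_ } unpack '(a3)*', $string;

print "@twobyte0 | @twobyte1 | @threebyte0\n";
```

The other two 24-bit alignments would need the same treatment on substr($string, 1) and substr($string, 2). Whether building all of these lists beats a straightforward loop is something only a benchmark on real data can answer.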

Re^2: [NOT] How would you decode this?
by BrowserUk (Patriarch) on Dec 28, 2010 at 15:48 UTC

    One problem is that vec doesn't handle 24-bit values.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Oh, then the Perl man page I consulted is deficient: perldoc -f vec says "This must be a power of two from 1 to 32". But a quick test shows that you are right.

        24 is not really a power of two :)

        24 isn't a power of 2.
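For anyone who wants to repeat the quick test, a small sketch (sample data and variable names are invented): vec dies when asked for a width that is not a power of two, and for a supported width it returns a single element per call.

```perl
use strict;
use warnings;

my $string = "\x01\x02\x03\x04";   # made-up sample data
my $bits   = 24;                   # not a power of two

# vec dies on an unsupported width, so trap the error with eval
my $value = eval { vec($string, 0, $bits) };
my $error = $@;
print "vec failed: $error" unless defined $value;

# A supported width returns exactly one element per call
my $first16  = vec($string, 0, 16);   # bytes 0-1, big-endian
my $second16 = vec($string, 1, 16);   # bytes 2-3: the offset counts elements

print "$first16 $second16\n";
```

Note that the second argument of vec is an element index rather than a bit position, so offset 1 with 16 bits means the second 16-bit element, not bit 1.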