Really fast => C

In pure Perl, it might make sense to decode the string in all possible alignments, i.e.:

    @singlebyte= vec $string,0,8;
    @twobyte0= vec $string,0,16;
    @twobyte1= vec $string,8,16;
    ...
    @threebyte2= vec $string,16,24;

and use appropriate offsets into these arrays (or pad the arrays so that you can use one offset for all).

If there is some bias in the input so that most integers are encoded with one of the three methods (for example, if 98% of all integers were encoded with 3 bytes), even better. You might get away with only one, two, or three of these vec calls and decode the other variants slowly without impacting overall performance.

A final AND is also necessary to eliminate the two high bits in the two-byte and three-byte cases, but I'm sure you already thought about this.

UPDATE: Not only is vec not possible with 24 bits, it also extracts only one value per call instead of handling the whole string. I should learn to read again. Without that, the time savings are probably minimal to nonexistent. So something like unpack "n*" and unpack "cn*" for the 16-bit values would be more appropriate. That still doesn't solve the 24-bit case.
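
A rough sketch of the unpack idea for the 16-bit values, in case it helps. The sample bytes, the 0x3FFF mask (which assumes the two high bits of a 16-bit group are the tag bits) and the helper twobyte_at are my own assumptions for illustration, not something taken from the thread. It uses "x n*" to skip the first byte; "cn*" would do the same but also return that byte as the first list element.

```perl
use strict;
use warnings;

# Hypothetical sample data: six raw bytes, not taken from the thread.
my $string = pack 'C*', 0x81, 0x02, 0xC3, 0x04, 0x05, 0x06;

# Big-endian 16-bit values starting at even byte offsets 0, 2, 4, ...
my @twobyte0 = unpack 'n*', $string;

# The same, but starting at odd byte offsets 1, 3, 5, ...
# 'x' skips one byte; a trailing incomplete pair is silently ignored.
my @twobyte1 = unpack 'x n*', $string;

# Assumption: the two high bits of each 16-bit group are tag bits,
# so the final AND keeps only the low 14 bits.
my @masked0 = map { $_ & 0x3FFF } @twobyte0;

# Hypothetical helper: the 16-bit value that starts at byte offset $i,
# looked up in whichever of the two pre-decoded arrays matches the alignment.
sub twobyte_at {
    my ($i) = @_;
    return $i % 2 ? $twobyte1[ ($i - 1) / 2 ] : $twobyte0[ $i / 2 ];
}
```

Each unpack call decodes the whole string once, so the per-value work afterwards is just an array lookup and a mask. The 24-bit case is not covered here.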


In reply to Re: [NOT] How would you decode this? by jethro
in thread [NOT] How would you decode this? by BrowserUk
