A lesson for me, that is, which I thought I'd share as a warning to others. :)

I needed to read and write data in a legacy format that, for reasons that may have seemed compelling in the 1970s, stores data "efficiently": data is 7-bit ASCII, and rather than waste a whole bit of every octet, 8 chars are packed into every 7 bytes.

pack can't handle this directly: it only does whole bytes. vec only does powers-of-2. If Bit::Vector can do it then I sure can't figure out how (but the documentation for that module makes my head explode at the best of times).

Fortunately, however, this type of encoding is still widely used in one everyday field: SMS. I know there are some SMS modules on CPAN. A quick search throws up Device::Gsm::Pdu and GSM::SMS::PDU. I look inside.

I didn't much like the look of either, to be honest. They're both full of stuff like

unpack 'b8', pack 'H2', substr $foo, 0, 2

which just screamed "inefficient" to me. I was sure I could reinvent this wheel better.

The naive way to do this is to unpack one's data into a bit string, insert/remove the zeroes in the eighth bit, and then pack the bits back together again:

sub packwithsubst { my $string = shift; my $bits = unpack 'b*', $string; $bits =~ s/(.{7})./$1/g; return pack 'Cb*', length($string), $bits; } sub unpackwithsubst { my ($length, $chars) = unpack 'Cb*', shift; $chars =~ s/(.{7})/${1}0/g; substr $chars, $length * 8, 8, ''; return pack 'b*', $chars; }

I stuck this with the CPAN code in a little script, and benchmarked it on the first 250-odd characters of "To be or not to be". And sure enough, my code was a lot faster:

Rate GSM::SMS GSM::SMS 324/s -- Device::Gsm 342/s 6% My_code 514/s 59%

But this was hardly a fair comparison; both the CPAN versions were going via a wholly unnecessary hexadecimal conversion step. What if I simply tried to improve the existing code? That is to say, used this:

my (%b2c, %c2b); $b2c{unpack 'b8', $_} = $_ for map chr, 0..255; $c2b{$_} = unpack 'b7', $_ for map chr, 0..127; sub packdevicegsm2 { # Device::Gsm::Pdu:encode_text7, not going via he +x my @char = split //, shift; my($result, $bits); map { $bits .= $c2b{$_} } @char; if (my $len = length($bits) % 8) { $bits .= '0' x (8 - $len) } $result .= $b2c{ substr $bits, 0, 8, '' } while length $bits; return chr(scalar @char) . $result; } sub unpackdevicegsm2 { # Device::Gsm::Pdu:decode_text7, not going via +hex my $text7 = shift; my $len = ord substr( $text7, 0, 1, '' ); my $bits = ''; $bits .= unpack 'b8', substr $text7, 0, 1, '' while length $text7; my $decoded = ''; $decoded .= pack 'b7', substr $bits, 0, 7, '' while length $bits > += 7; return substr $decoded, 0, $len; }

And, to my surprise:

Rate GSM::SMS GSM::SMS 322/s -- Device::Gsm 345/s 7% My_code 519/s 61% Dev::Gsm_opt 519/s 61%

Apparently loops and substr can be as fast as s///g. That was unexpected. And unwelcome. My pride lay in tatters. What had I achieved?!

Time to optimise.

In true hubristic style, I decided (without profiling) that the problem was probably the s///gs. They're doing a lot of copying. I could avoid them when packing by telling unpack to output only the 7 bits I was interested in; to do so I'd have to treat each byte separately, so I'd need a join, but those are fairly fast. There's no way to do that in reverse, but maybe splitting up the bits into an array would be quicker than inserting all those zeroes?

sub packwitharray { my $string = shift; return pack 'Cb*', length($string), join('', unpack '(b7)*', $stri +ng); } sub unpackwitharray { my ($length, $bits) = unpack 'Cb*', shift; my @chars = $bits =~ /.{7}/g; $#chars = $length - 1; return pack '(b7)*', @chars; }

Woo! The packing was golfed down to two lines! And it was faster, too:

Rate GSM::SMS GSM::SMS 321/s -- Device::Gsm 342/s 7% Dev::Gsm_opt 500/s 56% Subst/Subst 519/s 62% # This was called My_code above Subst/Array 527/s 64% # i.e. packwithsubst / unpackwitharray Array/Subst 777/s 142% Array/Array 808/s 152%

I was back in the lead! A 150% speed boost over the CPAN code was not to be sniffed at, surely? My code was awesome! It was probably faster than C!

Rate My_code C My_code 792/s -- -99% C 77422/s 9670% --

And I was enlightened.

Full benchmark listing, as spoiler for convenience (it's 260 lines):


In reply to Sweating the small stuff: a lesson in optimisation by Porculus

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.