in reply to Unexpected: unpack slower than several substr's. Why?

BazB

I can't reproduce your results. For me, unpack is substantially faster than multiple substrs.

            Rate substr_it unpack_it
substr_it 2341/s        --      -47%
unpack_it 4384/s       87%        --
Results tally

Without sight of your full benchmark code it's difficult to see where the discrepancy arises.

One difference is that I am accessing the source string via an alias rather than copying it, but that overhead should be the same in both cases. Another is that I return the result list from unpack directly, rather than assigning it to a local array and then building the return list from that, but the difference seems too big for that to account for all of it.

Maybe you could try your benchmark with these two changes and see if you still get the opposite result?
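To make the two changes concrete, here is a minimal sketch of both shapes side by side. The sub names and the short two-field format are hypothetical, and the "copy" version is only a guess at what the original code looks like, since the full benchmark was not posted.

```perl
use strict;
use warnings;

# Hypothetical two-field format, much shorter than the one in the benchmark.
my $fmt = '@10A5 @0A3';

# Copies the argument and stores the fields in a lexical array
# before returning them (assumed shape of the original code).
sub unpack_copy {
    my $line = shift;
    my @bits = unpack $fmt, $line;
    return @bits;
}

# Reads the argument through the $_[0] alias and returns
# the list produced by unpack directly.
sub unpack_alias {
    return unpack $fmt, $_[0];
}

my $line  = join '', 'a' .. 'y';
my @copy  = unpack_copy($line);
my @alias = unpack_alias($line);

print "@copy\n";
print "@alias\n";
```

Both subs return the same fields; only the amount of copying on the way out differs.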

Benchmark and full results.

#! perl -slw
use strict;
use vars qw[$line @s @s2 @u1 @u2];
use Benchmark qw[cmpthese];

$line = (join'','a' .. 'y') x 40;

sub substr_it{
    my @bits;
    push @bits, substr $_[0], 413, 30;
    push @bits, substr $_[0], 373, 30;
    push @bits, substr $_[0], 343, 30;
    push @bits, substr $_[0], 245, 15;
    push @bits, substr $_[0], 679, 30;
    push @bits, substr $_[0], 10, 10;
    push @bits, substr $_[0], 900, 100;
    push @bits, substr $_[0], 500, 2;
    push @bits, substr $_[0], 313, 30;
    push @bits, substr $_[0], 844, 44;
    push @bits, substr $_[0], 111, 1;
    push @bits, substr $_[0], 100, 100;
    push @bits, substr $_[0], 454, 30;
    push @bits, substr $_[0], 240, 3;
    push @bits, substr $_[0], 236, 4;
    @bits;
}

sub unpack_it{
    my $fmt = '@413A30 @373A30 @343A30 @245A15 @679A30 @10A10 @900A100 '
            . '@500A2 @313A30 @844A44 @111A1 @100A100 @454A30 @240A3 @236A4';
    unpack( $fmt, $_[0]);
}

cmpthese( -3, {
    substr_it => '@s = substr_it $line;',
    unpack_it => '@u1 = unpack_it $line;',
});

print 'Results tally';

__END__
C:\test>226666
Benchmark: running substr_it, unpack_it , each for at least 3 CPU seconds ...
 substr_it:  4 wallclock secs ( 3.01 usr +  0.00 sys =  3.01 CPU) @ 2340.74/s (n=7055)
 unpack_it:  3 wallclock secs ( 3.11 usr +  0.00 sys =  3.11 CPU) @ 4383.55/s (n=13646)
            Rate substr_it unpack_it
substr_it 2341/s        --      -47%
unpack_it 4384/s       87%        --
Results tally

C:\test>

Update:

Indeed, avoiding the allocation of a local array that is then used to build the return list seems to account for most of the savings. In line with MarkM's suggestion, I changed the substr version to avoid the array and build the return list in situ ...

sub substr_it{
    (
        substr( $_[0], 413, 30),
        substr( $_[0], 373, 30),
        substr( $_[0], 343, 30),
        substr( $_[0], 245, 15),
        substr( $_[0], 679, 30),
        substr( $_[0], 10, 10),
        substr( $_[0], 900, 100),
        substr( $_[0], 500, 2),
        substr( $_[0], 313, 30),
        substr( $_[0], 844, 44),
        substr( $_[0], 111, 1),
        substr( $_[0], 100, 100),
        substr( $_[0], 454, 30),
        substr( $_[0], 240, 3),
        substr( $_[0], 236, 4),
    );
}

... and this resulted in a substantial improvement in the performance of the substr version, though unpack is still 16% faster.

            Rate substr_it unpack_it
substr_it 3836/s        --      -14%
unpack_it 4460/s       16%        --
Results tally

Update2: You can gain a further 6-10% in the performance of the unpack version by making the format a constant sub rather than a my'd var.

use constant FORMAT => '@413A30 @373A30 @343A30 @245A15 @679A30 @10A10 @900A100 @500A2 '
                     . '@313A30 @844A44 @111A1 @100A100 @454A30 @240A3 @236A4';

sub unpack_it{
    unpack( FORMAT, $_[0]);
}

# gives
            Rate substr_it unpack_it
substr_it 3821/s        --      -16%
unpack_it 4575/s       20%        --
Results tally

Which may be worth having given the size of the dataset you are processing.


Examine what is said, not who speaks.

The 7th Rule of perl club is -- pearl clubs are easily damaged. Use a diamond club instead.

Re: Re: Unexpected: unpack slower than several substr's. Why?
by MarkM (Curate) on Jan 14, 2003 at 02:53 UTC

    'push(@array, scalar)' (your implementation) is almost 2X slower than '$array[fixed_index] = scalar' (his implementation).

    When verifying other people's benchmarks, try not to alter their code. :-)
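The push versus fixed-index claim can be checked in isolation with a sketch along these lines. The sub names are hypothetical and only three of the fifteen fields are extracted; the relative rates will depend on platform and perl version, so no expected figures are given.

```perl
use strict;
use warnings;
use Benchmark qw[cmpthese];

# Same 1000-character test string as in the benchmark above.
my $line = ( join '', 'a' .. 'y' ) x 40;

# Appends each field to the array with push.
sub with_push {
    my @bits;
    push @bits, substr $_[0], 413, 30;
    push @bits, substr $_[0], 373, 30;
    push @bits, substr $_[0], 343, 30;
    return @bits;
}

# Stores each field at a fixed index instead.
sub with_index {
    my @bits;
    $bits[0] = substr $_[0], 413, 30;
    $bits[1] = substr $_[0], 373, 30;
    $bits[2] = substr $_[0], 343, 30;
    return @bits;
}

# Run each variant for at least 1 CPU second and print a comparison table.
cmpthese( -1, {
    with_push  => sub { my @r = with_push($line) },
    with_index => sub { my @r = with_index($line) },
});
```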

      Actually, I had previously tried it using assignment rather than push, and the difference was so negligible as to be within the bounds of variance from one run to the next.

      C:\test>226666
      Benchmark: running substr_it, unpack_it , each for at least 3 CPU seconds ...
       substr_it:  4 wallclock secs ( 3.17 usr +  0.00 sys =  3.17 CPU) @ 2375.99/s (n=7520)
       unpack_it:  4 wallclock secs ( 3.38 usr +  0.00 sys =  3.38 CPU) @ 4371.56/s (n=14754)
                  Rate substr_it unpack_it
      substr_it 2376/s        --      -46%
      unpack_it 4372/s       84%        --
      Results tally

      C:\test>

      I did verify the original as best I could, but as he didn't supply the whole benchmark I had to make a few guesses.

      I also attempted to improve both as far as I could as you will see from the update to my first post based upon your suggestion :^)

      Older benchmark code for the above results

        I'd have to take back my educated guess and go with BrowserUk. I tried this on Solaris and Linux using several different variations and the substr version always comes out over 40% slower. And I did not see the push being twice as slow as the explicit array assignments on either system.

        Besides platform, the data being operated on would be the only other factor I can think of that might make the original so much different from what others have tried.

Re: Re: Unexpected: unpack slower than several substr's. Why?
by BazB (Priest) on Jan 14, 2003 at 20:55 UTC

    Thanks to everyone who replied(++!).
    Some very interesting comments.

    The suggestion not to use a lexically scoped/my'ed array, but instead to just build up the return list and nothing more, made a serious difference.

    The code is generated on the fly, then eval'ed once to create the anonymous subroutines, then called multiple times. The eval was not included in any benchmarking.

    Using a constant for the format, rather than a string in a scalar, isn't needed because of the eval.
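A minimal sketch of that kind of generated code, assuming a hypothetical list of [offset, length] pairs as the field layout (the real generator was not posted). Because the template ends up as a string literal inside the eval'ed source, there is no variable lookup left for a constant to remove.

```perl
use strict;
use warnings;

# Hypothetical field layout: [offset, length] pairs.
my @fields = ( [ 10, 5 ], [ 0, 3 ], [ 20, 2 ] );

# Build the unpack template from the layout: '@10A5 @0A3 @20A2'.
my $fmt = join ' ', map { "\@$_->[0]A$_->[1]" } @fields;

# Generate the source text once; the template is embedded as a literal.
my $src = "sub { unpack( '$fmt', \$_[0] ) }";

# Compile it once into an anonymous sub, then call it as often as needed.
my $extract = eval $src or die "eval failed: $@";

my $line = join '', 'a' .. 'y';
my @bits = $extract->($line);

print "@bits\n";
```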

    I have a feeling some of this problem is due to the fact that the perl I use at home is 5.8.0 with threading.
    The 5.6.1 I use at work shows more of an improvement between different implementations.

    I've got benchmarks for 5.6.1 and 5.8.0 (in the readmore section below) for comparison.

    A_unpack_line_no_array uses unpack("A...", $line)
    unpack_line_no_array uses unpack("a...", $line).
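The two template letters differ in how trailing padding is treated on unpacking: "A" strips trailing spaces and NULs from each field, while "a" returns the bytes unchanged. A small sketch with a made-up input string:

```perl
use strict;
use warnings;

# Hypothetical record: a 6-character field padded with spaces, then more data.
my $line = 'abc   def';

# 'A6' takes 6 characters and strips the trailing spaces.
my ($stripped) = unpack 'A6', $line;

# 'a6' takes the same 6 characters and leaves them as they are.
my ($raw) = unpack 'a6', $line;

print "[$stripped]\n";
print "[$raw]\n";
```

The extra stripping pass is one reason the two variants can benchmark differently on space-padded data.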

    Update: changed pre tags to code tags.