BazB has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to squeeze some more speed out of some code that I've got.

The code in question carries out 15 substr operations for each line of input.
I was expecting a single unpack to be faster, however this wasn't the case.

Could anyone explain what's going on?

Which modules could help trace this sort of issue? Things like Devel::DProf aren't quite the right tool.

Cheers.

BazB.


Benchmarks and code follow...

This is perl, v5.8.0 built for i386-linux-thread-multi
Benchmark: timing 100000 iterations of substr, unpack...
             Rate unpack substr
unpack    40486/s     --   -28%
substr    55866/s    38%     --
The substr and unpack code is generated using positions slurped from a data dictionary, then eval'ed, producing the following:
sub {
    my $line = shift;
    my @data;
    $data[0]  = substr($line, 413, 30);
    $data[1]  = substr($line, 373, 30);
    $data[2]  = substr($line, 343, 30);
    # etc...
    $data[12] = substr($line, 454, 30);
    $data[13] = substr($line, 240, 3);
    $data[14] = substr($line, 236, 4);
    return @data;
};
#-------------------------------------
sub {
    my $line = shift;
    my @data = unpack('@413A30 @373A30 @343A30 # etc... @454A30 @240A3 @236A4', $line);
};
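
The full benchmark harness isn't shown above; a minimal harness along the following lines (a sketch only -- the dummy record, the abbreviated field list and the cmpthese wrapper are assumptions rather than the original code) would produce output in the same format:

use strict;
use warnings;
use Benchmark qw(cmpthese);

my $line = 'x' x 1000;    # dummy fixed-width record

# Abbreviated stand-ins for the generated subs shown above.
my $substr_sub = sub {
    my $line = shift;
    my @data;
    $data[0] = substr( $line, 413, 30 );
    $data[1] = substr( $line, 373, 30 );
    $data[2] = substr( $line, 343, 30 );
    return @data;
};

my $unpack_sub = sub {
    my $line = shift;
    my @data = unpack( '@413A30 @373A30 @343A30', $line );
    return @data;
};

cmpthese( 100_000, {
    substr => sub { my @d = $substr_sub->($line) },
    unpack => sub { my @d = $unpack_sub->($line) },
} );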

Re: Unexpected: unpack slower than several substr's. Why?
by BrowserUk (Patriarch) on Jan 14, 2003 at 02:42 UTC

    BazB

    I can't reproduce your results. I get unpack to be substantially faster than multiple substrs?

                  Rate substr_it unpack_it
    substr_it   2341/s        --      -47%
    unpack_it   4384/s       87%        --
    Results tally

    Without sight of your full benchmark code it's difficult to see where the discrepancy arises.

    One difference is that I am accessing the source string via an alias rather than copying it, but the overhead should be the same in both cases. Another difference is that I am returning the results list from unpack directly rather than assigning it to a local array and then building the return list from that, but the difference seems too big for that to account for all of it.
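
    For illustration, those two differences might look something like this (a rough, abbreviated sketch rather than my actual benchmark code):

    # Copying the line, then filling and returning a local array:
    sub unpack_via_array {
        my $line = shift;
        my @data = unpack( '@413A30 @373A30 @343A30', $line );
        return @data;
    }

    # Aliasing the line via $_[0] and returning unpack's list directly:
    sub unpack_direct {
        return unpack( '@413A30 @373A30 @343A30', $_[0] );
    }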

    Maybe you could try your benchmark with these two changes and see if you still get the opposite result?

    Benchmark and full results.

    Update:

    Indeed, avoiding the allocation of a local array that is then used to build the return list seems to account for most of the savings. In line with MarkM's suggestion, I changed the substr version to avoid the array and build the return list in situ ...

    sub substr_it {
        ( substr( $_[0], 413,  30 ),
          substr( $_[0], 373,  30 ),
          substr( $_[0], 343,  30 ),
          substr( $_[0], 245,  15 ),
          substr( $_[0], 679,  30 ),
          substr( $_[0],  10,  10 ),
          substr( $_[0], 900, 100 ),
          substr( $_[0], 500,   2 ),
          substr( $_[0], 313,  30 ),
          substr( $_[0], 844,  44 ),
          substr( $_[0], 111,   1 ),
          substr( $_[0], 100, 100 ),
          substr( $_[0], 454,  30 ),
          substr( $_[0], 240,   3 ),
          substr( $_[0], 236,   4 ),
        );
    }

    ... and this resulted in a substantial improvement in the performance of the substr version, though it still falls 16% short of unpack.

                  Rate substr_it unpack_it
    substr_it   3836/s        --      -14%
    unpack_it   4460/s       16%        --
    Results tally

    Update2: You can gain a further 6-10% improvement in the performance of the unpack version by making the format a constant sub rather than a my'd var.

    use constant FORMAT => '@413A30 @373A30 @343A30 @245A15 @679A30 @10A10 @900A100 @500A2 '
                         . '@313A30 @844A44 @111A1 @100A100 @454A30 @240A3 @236A4';

    sub unpack_it { unpack( FORMAT, $_[0] ); }

    # gives
                  Rate substr_it unpack_it
    substr_it   3821/s        --      -16%
    unpack_it   4575/s       20%        --
    Results tally

    Which may be worth having given the size of the dataset you are processing.


    Examine what is said, not who speaks.

    The 7th Rule of perl club is -- pearl clubs are easily damaged. Use a diamond club instead.

      'push(@array, scalar)' (your implementation) is almost 2X slower than '$array[fixed_index] = scalar' (his implementation).
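
      A quick way to check that claim would be something like the following (an illustrative sketch, not the benchmark behind the figure above):

      use strict;
      use warnings;
      use Benchmark qw(cmpthese);

      cmpthese( -3, {
          push_it  => sub { my @a; push @a, $_ for 1 .. 15 },
          index_it => sub { my @a; $a[ $_ - 1 ] = $_ for 1 .. 15 },
      } );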

      When verifying other people's benchmarks, try not to alter their code. :-)

        Actually, I had previously tried it using assignment rather than push and the difference was so negligible as to be within the bounds of variance from one run to the next.

        C:\test>226666
        Benchmark: running substr_it, unpack_it, each for at least 3 CPU seconds ...
        substr_it:  4 wallclock secs ( 3.17 usr + 0.00 sys = 3.17 CPU) @ 2375.99/s (n=7520)
        unpack_it:  4 wallclock secs ( 3.38 usr + 0.00 sys = 3.38 CPU) @ 4371.56/s (n=14754)
                      Rate substr_it unpack_it
        substr_it   2376/s        --      -46%
        unpack_it   4372/s       84%        --
        Results tally

        C:\test>

        I did verify the original as best I could, but as he didn't supply the whole benchmark I had to make a few guesses.

        I also attempted to improve both as far as I could as you will see from the update to my first post based upon your suggestion :^)

        Older benchmark code for the above results

      Thanks to everyone who replied(++!).
      Some very interesting comments.

      The suggestion not to use a lexically scoped/my'ed array, but instead to build up just the return list and no more, made a serious difference.

      The code is generated on the fly, then eval'ed once to create the anonymous subroutines, then called multiple times. The eval was not included in any benchmarking.

      Using a constant for the format, rather than a string in a scalar isn't needed, because of the eval.
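
      For the curious, the generate-then-eval step looks roughly like this (a sketch; the (offset, length) pairs stand in for the real data dictionary):

      use strict;
      use warnings;

      my @fields = ( [ 413, 30 ], [ 373, 30 ], [ 343, 30 ], [ 240, 3 ], [ 236, 4 ] );

      my $i   = 0;
      my $src = "sub {\n  my \$line = shift;\n  my \@data;\n"
              . join( '', map { "  \$data[" . $i++ . "] = substr(\$line, $_->[0], $_->[1]);\n" } @fields )
              . "  return \@data;\n}";

      my $extract = eval $src or die $@;       # eval'ed once ...
      my @record  = $extract->( 'x' x 500 );   # ... then called per line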

      I have a feeling some of this problem is due to the fact that the perl I use at home is 5.8.0 with threading.
      The 5.6.1 I use at work shows more of an improvement between different implementations.

      I've got benchmarks for 5.6.1 and 5.8.0 (in the readmore section below) for comparison.

      A_unpack_line_no_array uses unpack("A...", $line);
      unpack_line_no_array uses unpack("a...", $line).


Re: Unexpected: unpack slower than several substr's. Why?
by MarkM (Curate) on Jan 14, 2003 at 02:20 UTC

    There are many factors to consider when comparing the performance of your two models.

    First, substr() and unpack() behave differently:

    • substr() copies the sub-string.
    • unpack('A30') copies the sub-string, scans it to locate the end of the space padded string, and shrinks the sub-string such that it does not include the padding.
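
    That difference is easy to see in isolation (illustrative snippet):

    my $line = sprintf '%-10s', 'foo';        # "foo" plus seven trailing spaces
    my $raw  = substr( $line, 0, 10 );        # "foo       " -- padding kept
    my ($trim) = unpack( 'A10', $line );      # "foo"        -- padding stripped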

    Secondly, Perl treats scalar assignments differently from list assignments; in fact, the scalar assignment op-code is implemented much more simply. Benchmark the difference between "($a) = (1)" and "$a = 1" to see the effect that this has. Although 15+ scalar assignments may be more expensive than a single list assignment that involves 15+ values, the overhead of the list assignment dampens the overhead of having many scalar assignments.
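
    Such a comparison might be benchmarked like this (sketch only):

    use strict;
    use warnings;
    use Benchmark qw(cmpthese);

    my $x;
    cmpthese( -3, {
        scalar_assign => sub { $x = 1 },
        list_assign   => sub { ($x) = (1) },
    } );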

    The only true way of comparing the two approaches, is the method that you have chosen -- benchmarking.

    If you need to improve the performance of your substr() model, one (ugly) approach may be to use one large list assignment ("@data = (substr(), substr(), ...)") the same as you do for unpack(). Again, only benchmarking can tell... :-)

    UPDATE: substr() only uses magic if called in lvalue context.

Re: Unexpected: unpack slower than several substr's. Why?
by theorbtwo (Prior) on Jan 14, 2003 at 02:17 UTC

    I suspect the reason is that substr() can just copy the data, whereas unpack() must recode and check for more strange conditions. Also, try a (rather than A), Z, and z.
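
    Something along these lines would compare the two formats (a sketch; the field positions are placeholders, not BazB's real layout):

    use strict;
    use warnings;
    use Benchmark qw(cmpthese);

    my $line = 'x' x 1000;

    cmpthese( -3, {
        upper_A => sub { my @d = unpack( '@413A30 @373A30 @343A30', $line ) },
        lower_a => sub { my @d = unpack( '@413a30 @373a30 @343a30', $line ) },
    } );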


    Warning: Unless otherwise stated, code is untested. Do not use without understanding. Code is posted in the hopes it is useful, but without warranty. All copyrights are relinquished into the public domain unless otherwise stated. I am not an angel. I am capable of error, and err on a fairly regular basis. If I made a mistake, please let me know (such as by replying to this node).

Re: Unexpected: unpack slower than several substr's. Why?
by steves (Curate) on Jan 14, 2003 at 01:46 UTC

    My educated guess would be that since unpack is more general purpose, and since it has to parse the format specification(s) before it knows what to do, a small set like 15 fields would be faster with substr calls. substr is simple and specific enough that many of us could write it in our heads. Probably not so with unpack.

    You're doing the right thing by using benchmarks to make your decisions. Sometimes what seems obvious isn't when it comes to performance.