in reply to Challenge: CPU-optimized byte-wise or-equals (for a meter of beer)

After posting anonymously (#639826),
I did some more research on the topic.

What's the status of the beer meter anyway?

OK, there is no way for pure Perl to come even
close to a tailored inline solution, so if speed is
important, one has to use inline code.
I compiled several snippets from this thread
into one benchmark ==> http://hobbit.chemie.uni-halle.de/project/meterofbeer/
and added some results.

One very interesting outcome for me was the
revelation of how dead slow the Core2 architecture
is at repeated string opcodes in assembly,
such as repne scasb. Replacing the scasb with a sequence of "mov"
instructions makes it blazingly fast on a Core2
(compare rep_scasb and cmp_movb in the results below).
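
For comparison, here is a minimal sketch of the compare-and-move idea in plain C. It assumes the task is the one from the root challenge as I understand it: wherever the first buffer holds a NUL byte, take the byte at the same offset from the second buffer. The name or_equals_bytes is made up for this example, and this is not the cmp_movb code from the benchmark. A compiler will typically turn such a loop into ordinary load/compare/store instructions rather than string opcodes.

```c
#include <assert.h>
#include <stddef.h>

/* Byte-wise "or-equals": wherever dst holds a NUL byte, copy the byte at
 * the same offset from src. Both buffers must be at least len bytes long.
 * or_equals_bytes is a hypothetical name, not taken from the benchmark. */
void or_equals_bytes(char *dst, const char *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < len; i++) {
        if (dst[i] == '\0')      /* compare one byte ...              */
            dst[i] = src[i];     /* ... and move one byte when needed */
    }
}
```

Called as or_equals_bytes(buf1, buf2, n), it modifies buf1 in place and leaves buf2 untouched.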

There seems to be an error in the
sub "corion" ==> http://hobbit.chemie.uni-halle.de/project/meterofbeer/beerbench.pl;
maybe somebody can fix it.

This is on a Core2/Q6600@3GHz (more results in the other link):
             Rate
split1     7.35/s
ikegami1   36.6/s
substr1    45.9/s
mrm_6      1169/s
avar2      1536/s
corion     1662/s
avar2_pos  2701/s
mrm_3      2819/s
ikegami2   3137/s
bart       3480/s
ikegami3   3488/s
mrm_1      3508/s
ikegami4   3655/s
moritz     3719/s
mrm_5      4271/s
mrm_4      4346/s
rep_scasb  4495/s
inline_c  11310/s
cmp_movb  11563/s
Interesting stuff!

Thanks & bye

Mirco

Re^2: Challenge: CPU-optimized byte-wise or-equals (for a meter of beer)
by mwah (Hermit) on Sep 19, 2007 at 21:36 UTC
    After looking through the entries here,
    I found one application of memchr() (by diotalevi).
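
    To show how memchr() can be applied to this problem, here is a minimal sketch. It is not diotalevi's code; it assumes the task is to replace every NUL byte in the first buffer with the byte at the same offset in the second one, and the name or_equals_memchr is made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Replace every NUL byte in dst with the byte at the same offset in src,
 * letting memchr() do the scanning. or_equals_memchr is a hypothetical
 * name; this is not the code from the thread. */
void or_equals_memchr(char *dst, const char *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    char *p = dst;
    char *end = dst + len;

    /* memchr() returns a pointer to the next NUL byte, or NULL if none. */
    while (p < end && (p = memchr(p, '\0', (size_t)(end - p))) != NULL) {
        *p = src[p - dst];   /* patch this byte from the second buffer */
        p++;                 /* resume the scan right after it */
    }
}
```

    The loop body only runs once per NUL byte, so all the scanning in between is left to the C library.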

    After researching this a bit, I
    found that it is by far the fastest
    approach on every platform I tested.

    Why is that?

    Looking at the memchr() sources,
    it can be seen that it is heavily optimized
    for DWORD-aligned, machine-word-sized
    access to memory.

    Judging by its assembly code, memchr() is a nice piece
    of optimized code, especially the detection of the position
    and the extraction of single characters.
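
    The general idea of word-sized scanning can be sketched in C as follows. This is only an illustration of the technique, not the actual memchr() source: it uses a well-known bit trick to test four bytes at once for a zero byte and then pins down the exact position byte by byte. The name find_nul_wordwise is made up, and a real implementation would also align the pointer first and be tuned per platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return the offset of the first NUL byte in buf[0..len), or len if there
 * is none. find_nul_wordwise is a hypothetical name for this sketch. */
size_t find_nul_wordwise(const char *buf, size_t len)
{
    assert(buf != NULL);
    size_t i = 0;

    /* Test four bytes per iteration. */
    for (; i + 4 <= len; i += 4) {
        uint32_t w;
        memcpy(&w, buf + i, sizeof w);   /* alignment-safe 32-bit load */
        /* Nonzero exactly when at least one byte of w is zero. */
        if ((w - 0x01010101u) & ~w & 0x80808080u)
            break;
    }

    /* Locate the exact byte; this also handles the last len % 4 bytes. */
    for (; i < len; i++) {
        if (buf[i] == '\0')
            return i;
    }
    return len;
}
```

    The word test only says that some byte in the group is zero; the short byte loop afterwards determines which one.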

    One can't beat this with a few lines of assembly.

    I wonder if Perl uses the underlying memchr()
    anywhere in its codebase (regex)?

    (I updated the Benchmark sources and results.)

    Regards
    Mirco