in reply to Re: Re: Genesis of a sort routine
in thread Genesis of a sort routine

Dunno, in my benchmark (performed after I wrote the node, of course), they seem to come out pretty even, so I'd be likely to go with the 'optimized regex' answer given by the AM. Did you have different results? Maybe my benchmark wasn't very good...

--
3dan

Re+: Genesis of a sort routine
by BrowserUk (Patriarch) on Nov 06, 2003 at 10:30 UTC

    Hmm. I originally used a debug session and timethese(), did a little mental arithmetic that gelled with what Anonymonk said, and went with it.

    Now I've put it into a proper script, used cmpthese() to do the arithmetic, and I can't reproduce my original results. In fact, I seem to consistently get the index version coming out quicker, sometimes markedly so.
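For anyone checking the arithmetic by hand, the percentage in each cell of a cmpthese() table follows from the two rates involved. The snippet below is only a sketch: `pct_diff` is a made-up helper name, and the two rates are copied from one of the runs quoted in this post.

```perl
use strict;
use warnings;

# Hypothetical helper: how much faster (or slower) the row entry is than the
# column entry, as a whole-number percentage like the ones cmpthese() prints.
sub pct_diff {
    my ( $row_rate, $col_rate ) = @_;
    return sprintf '%.0f%%', 100 * $row_rate / $col_rate - 100;
}

# Rates in iterations per second, taken from one quoted run.
my %rate = ( regex => 14.6, index => 16.1 );

print 'regex vs index: ', pct_diff( $rate{regex}, $rate{index} ), "\n";  # -9%
print 'index vs regex: ', pct_diff( $rate{index}, $rate{regex} ), "\n";  # 10%
```

Note that the two figures are not mirror images of each other (-9% one way, 10% the other), which is an easy place to slip when doing the sums mentally.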

    Maybe I did the arithmetic wrong, or there is a flaw in my benchmark? I used random strings with (or without) randomly positioned ':' to try to cover the failing conditions as well as a good spread of the passing ones. The spread of variation in the timings seems to indicate that some combinations cause the regex engine to take much longer than others. I've not managed to get the difference below 9% in favour of index, and on one occasion it went as high as 28%. Further investigation is called for :)

    #! perl -slw
    use strict;
    use Benchmark qw[ cmpthese ];

    sub rndStr{ join'', map{ $_[ rand scalar @_ ] } 0 .. shift }

    our @strings = map{ rndStr 8, ':', 'a' .. 'z' } 1 .. 1000;

    our( @a, @b );
    cmpthese( -3, {
        regex => q[
            our @a = sort{
                ( $a =~ /:/ <=> $b =~ /:/ ) || $a cmp $b
            } @strings
        ],
        index => q[
            our @b = sort{
                ( ( index($a,':') >= 0 ) <=> ( index($b,':') >= 0) ) || $a cmp $b;
            } @strings
        ],
    });

    print 'Okay' if "@a" eq "@b";

    __END__
    P:\test>test3
            Rate regex index
    regex 14.6/s    --   -9%
    index 16.1/s   10%    --
    Okay

    P:\test>test3
            Rate regex index
    regex 13.7/s    --  -15%
    index 16.2/s   18%    --
    Okay

    P:\test>test3
            Rate regex index
    regex 14.5/s    --  -11%
    index 16.3/s   13%    --
    Okay

    P:\test>test3
            Rate regex index
    regex 13.1/s    --  -19%
    index 16.2/s   23%    --
    Okay
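To make it easier to see what the two comparators are meant to agree on, here is a small sketch that runs both over a hand-picked list. The sample strings are an assumption chosen for illustration, not data from the benchmark. Strings without a ':' should sort ahead of those with one, with ties broken by plain string comparison.

```perl
use strict;
use warnings;

# Hand-picked sample (illustrative only): a mix of strings with and without ':'.
my @sample = ( 'b:c', 'zeta', 'a:b', 'alpha', ':x' );

# Regex flavour: a failed match compares as 0, a successful one as 1.
my @by_regex = sort {
    ( $a =~ /:/ <=> $b =~ /:/ ) || $a cmp $b
} @sample;

# index flavour: index() returns -1 when ':' is absent, so ">= 0" is the test.
my @by_index = sort {
    ( ( index( $a, ':' ) >= 0 ) <=> ( index( $b, ':' ) >= 0 ) ) || $a cmp $b
} @sample;

print "@by_regex\n";                                  # alpha zeta :x a:b b:c
print "same order\n" if "@by_regex" eq "@by_index";
```

Both versions give the same ordering here, so the benchmark is comparing like with like; only the cost of the "contains ':'" test differs.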

    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail
    Hooray!
    Wanted!

      Interesting. Running that code exactly gives me a win for 'regex' on both 5.6.1/Solaris-7 running on a Sun and 5.8.0/Linux-2.4.18 running on a PC. On the Sun, the win is usually large: between 15% and 21%. On the Linux box, it's usually only very small: 0% to 2%.

      -sauoq
      "My two cents aren't worth a dime.";
      

        Intriguing. Maybe it comes down to the underlying C runtime code? I just tried it with AS 5.8.0, AS 5.6.1 and 5.8.1 built with Borland, and the results always favour index... They also highlight the penalty of Unicode support :(

        P:\test>perl5.8.0 test3.pl8
                    Rate regex index Abi_regex Abi_index
        regex     14.6/s    --  -10%      -14%      -21%
        index     16.3/s   11%    --       -4%      -13%
        Abi_regex 17.0/s   16%    4%        --       -9%
        Abi_index 18.6/s   27%   15%       10%        --
        Okay

        P:\test>e:\perl5.8.1\bin\perl5.8.1 test3.pl8
                    Rate regex index Abi_regex Abi_index
        regex     13.1/s    --  -11%      -27%      -31%
        index     14.7/s   12%    --      -18%      -23%
        Abi_regex 17.9/s   37%   22%        --       -6%
        Abi_index 19.0/s   45%   30%        6%        --
        Okay

        P:\test>perl5.6.1 test3.pl8
        [snip]
                    Rate regex index Abi_regex Abi_index
        regex     13.8/s    --  -11%      -50%      -51%
        index     15.5/s   13%    --      -44%      -45%
        Abi_regex 27.6/s  100%   77%        --       -3%
        Abi_index 28.4/s  106%   83%        3%        --
        Okay
