Re: Genesis of a sort routine

Re: Genesis of a sort routine by Abigail-II (Bishop) on Nov 06, 2003 at 10:32 UTC
If you want to be faster, it's better to eliminate the sort block than to make the block slightly faster. Something like the following ought to work (untested): `my @sorted = map {substr $_ => 1} sort map {/:/ ? "1$_" : "0$_"} @unsorted;`

Abigail
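The one-liner above can be unpacked into three steps: tag, sort, strip. What follows is only a sketch on made-up sample data (the contents of `@unsorted` and the intermediate names `@tagged`, `@ordered` and `@expected` are assumptions for illustration); it also checks the result against a comparison-block sort.

```perl
use strict;
use warnings;

# Hypothetical sample data; any strings with or without ':' will do.
my @unsorted = ('pear', 'b:2', 'apple', 'a:1', 'fig');

# Step 1: prefix each string with a one-character key,
# "0" when it has no colon and "1" when it has one.
my @tagged = map { /:/ ? "1$_" : "0$_" } @unsorted;

# Step 2: plain string sort with no sort block.
# The key prefix puts all colon-free strings first.
my @ordered = sort @tagged;

# Step 3: strip the one-character key again.
my @sorted = map { substr $_, 1 } @ordered;

# Reference result from the comparison-block approach.
my @expected = sort { (($a =~ /:/) <=> ($b =~ /:/)) || $a cmp $b } @unsorted;

print "@sorted\n";
print "same as the sort-block version\n" if "@sorted" eq "@expected";
```

The colon test runs once per element here, whereas the sort block runs it twice per comparison.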
Re: Genesis of a sort routine by Abigail-II (Bishop) on Nov 06, 2003 at 10:59 UTC
Here's a benchmark to back up the claim. Replacing the regex in the map with index gives a slightly better result still, but that effect is minimal compared to eliminating the sort block (linear vs n log n: the colon test runs once per element rather than in every comparison).

```perl
#!/usr/bin/perl

use strict;
use warnings;

use Benchmark qw /timethese cmpthese/;

my $elements = 1_000;
my $colon    =    50;

our @array = map {$_ = crypt $_, sprintf "%02x" => rand 256;
                  substr $_, int rand length, 1, ":" if $colon < rand 100;
                  $_} 1 .. $elements;

our (@green, @dan, @abi1, @abi2);

cmpthese -10 => {
    greenFox => '@green = sort {(($a =~ /:/) <=> ($b =~ /:/)) || $a cmp $b}
                          @array',
    '3dan'   => '@dan   = sort {((index ($a, ":") >= 0) <=>
                                 (index ($b, ":") >= 0)) || $a cmp $b}
                          @array',
    abigail1 => '@abi1  = map  {substr $_ => 1} sort
                          map  {/:/ ? "1$_" : "0$_"} @array',
    abigail2 => '@abi2  = map  {substr $_ => 1} sort
                          map  {index ($_, ":") >= 0 ? "1$_" : "0$_"} @array',
};

warn '@green != @dan',  "\n" if "@green" ne "@dan";
warn '@green != @abi1', "\n" if "@green" ne "@abi1";
warn '@green != @abi2', "\n" if "@green" ne "@abi2";
warn '@dan != @abi1',   "\n" if "@dan"   ne "@abi1";
warn '@dan != @abi2',   "\n" if "@dan"   ne "@abi2";
warn '@abi1 != @abi2',  "\n" if "@abi1"  ne "@abi2";

__END__
          Rate greenFox     3dan abigail1 abigail2
greenFox 109/s       --     -16%     -54%     -56%
3dan     130/s      19%       --     -45%     -47%
abigail1 236/s     117%      82%       --      -4%
abigail2 246/s     126%      89%       4%       --
```

Abigail
Re: Re: Genesis of a sort routine by Anonymous Monk on Nov 06, 2003 at 14:30 UTC
Adding this entry to your benchmark code: `anon => '@anon = (sort(grep!/:/,@array), sort(grep/:/,@array))',` demonstrates a significant speed increase (and the 'grep only once' anonymous version seen previously is slightly faster still):

```
           Rate greenFox   3dan abigail1 abigail2   anon
greenFox 26.2/s       --    -7%     -48%     -51%   -68%
3dan     28.2/s       8%     --     -44%     -48%   -65%
abigail1 50.7/s      94%    80%       --      -6%   -37%
abigail2 53.7/s     105%    91%       6%       --   -33%
anon     80.7/s     208%   186%      59%      50%     --
```
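The 'grep only once' version mentioned above is not quoted in this part of the thread, so the following is only a guess at the general idea, not that poster's code: split the input into two buckets in a single pass, then sort each bucket with the default string comparison. The sample data and the name `@bucket` are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data.
my @array = ('pear', 'b:2', 'apple', 'a:1', 'fig');

# One pass over the input: bucket 0 collects strings without a colon,
# bucket 1 collects strings that contain one.
my @bucket = ([], []);
push @{ $bucket[ index($_, ':') >= 0 ? 1 : 0 ] }, $_ for @array;

# Sort each bucket with the default string comparison and concatenate.
my @sorted = (sort(@{ $bucket[0] }), sort(@{ $bucket[1] }));

print "@sorted\n";
```

Each string is tested for a colon exactly once, and neither sort needs a comparison block.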
Re: Re: Genesis of a sort routine by BrowserUk (Patriarch) on Nov 06, 2003 at 10:44 UTC
Combine your method with index instead of a regex and it goes quicker still.

Updated: Right conclusions, wrong evidence. Ignore this; the tests are bad.

Updated: I extended the tests without testing the tests! D'oh.
Re: Re: Genesis of a sort routine by bart (Canon) on Nov 06, 2003 at 18:38 UTC
As for an attribution: the method applied qualifies as the Guttman-Rosler Transform (GRT). See Larry Rosler and Uri Guttman's paper on sorting in Perl and demerphq's summary in "Advanced Sorting - GRT - Guttman Rosler Transform".
Re: Re: Genesis of a sort routine by BrowserUk (Patriarch) on Nov 06, 2003 at 09:39 UTC
The two extra comparisons, `>= 0`, cost you 10%.

Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
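For context on why the `>= 0` comparisons are there at all: `index` returns -1 when the substring is absent and a 0-based offset when it is present, so its raw return value cannot be used directly as a true/false flag. A small sketch on made-up strings (the sample values and the name `@rows` are assumptions for illustration):

```perl
use strict;
use warnings;

my @rows;
for my $s (':abc', 'ab:c', 'abc') {
    my $pos      = index($s, ':');      # -1 when ':' is absent, else 0-based offset
    my $naive    = $pos      ? 1 : 0;   # wrong: 0 (found at start) is false, -1 is true
    my $by_index = $pos >= 0 ? 1 : 0;   # correct presence test
    my $by_regex = $s =~ /:/ ? 1 : 0;   # regex presence test, for comparison
    push @rows, [$s, $pos, $naive, $by_index, $by_regex];
    printf "%-5s pos=%2d naive=%d index>=0:%d regex:%d\n",
        $s, $pos, $naive, $by_index, $by_regex;
}
```

The `index >= 0` and regex columns agree on every string, while the naive column is wrong for both `:abc` and `abc`.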
Re: Re: Re: Genesis of a sort routine by edan (Curate) on Nov 06, 2003 at 09:50 UTC
Dunno, in my benchmark (performed after I wrote the node, of course), they seem to come out pretty even, so I'd be likely to go with the 'optimized regex' answer given by the AM. Did you have different results? Maybe my benchmark wasn't very good...

-- 3dan
Re+: Genesis of a sort routine by BrowserUk (Patriarch) on Nov 06, 2003 at 10:30 UTC
Hmm. I originally used a debug session and timethese(), did a little mental arithmetic that jelled with what Anonymonk said, and went with it. Now I've put it into a proper script, used cmpthese() to do the arithmetic, and I can't reproduce my original results. In fact, I seem to consistently get the index version coming out quicker, sometimes markedly so??

Maybe I did the arithmetic wrong, or there is a flaw in my benchmark? I used random strings with (or without) randomly positioned ':' to try to cover the failing conditions as well as a good spread of the passing ones. The spread of variation in timings seems to indicate that some combinations cause the regex engine to take much longer than others. I've not managed to get the difference below 9% in favour of index, and on one occasion it went as high as 28%. Further investigation is called for :)

```perl
#! perl -slw
use strict;
use Benchmark qw[ cmpthese ];

# Build a random string of (first argument + 1) characters
# drawn from the remaining arguments.
sub rndStr{ join'', map{ $_[ rand scalar @_ ] } 0 .. shift }

our @strings = map{ rndStr 8, ':', 'a' .. 'z' } 1 .. 1000;
our( @a, @b );

cmpthese( -3, {
    regex => q[
        our @a = sort{
            ( $a =~ /:/ <=> $b =~ /:/ ) || $a cmp $b
        } @strings
    ],
    index => q[
        our @b = sort{
            ( ( index($a,':') >= 0 ) <=> ( index($b,':') >= 0) ) || $a cmp $b;
        } @strings
    ],
});

print 'Okay' if "@a" eq "@b";

__END__
P:\test>test3
        Rate regex index
regex 14.6/s    --   -9%
index 16.1/s   10%    --
Okay

P:\test>test3
        Rate regex index
regex 13.7/s    --  -15%
index 16.2/s   18%    --
Okay

P:\test>test3
        Rate regex index
regex 14.5/s    --  -11%
index 16.3/s   13%    --
Okay

P:\test>test3
        Rate regex index
regex 13.1/s    --  -19%
index 16.2/s   23%    --
Okay
```
Re: Re+: Genesis of a sort routine by sauoq (Abbot) on Nov 06, 2003 at 10:58 UTC
Re: Re: Re+: Genesis of a sort routine by BrowserUk (Patriarch) on Nov 06, 2003 at 11:19 UTC
Re: Re: Genesis of a sort routine by Anonymous Monk on Nov 06, 2003 at 09:01 UTC
I think perl's re's are pretty well optimized for fixed string searches, so I would expect minimal gains from using index in this case.