Re^7: schwartzian transform problem

Re^8: schwartzian transform problem by ikegami (Patriarch) on Mar 25, 2025 at 23:02 UTC
You forgot to include the solution I provided! And let's include a basic use of `sort` to see what that looks like and how the others compare to that.

              Rate basic   GRT    ST keysort
    basic   1.66/s    --  -88%  -89%    -91%
    GRT     13.3/s  701%    --  -12%    -25%
    ST      15.1/s  810%   14%    --    -15%
    keysort 17.7/s  972%   34%   18%      --

              Rate basic   GRT    ST keysort
    basic   1.79/s    --  -87%  -89%    -91%
    GRT     13.5/s  654%    --  -15%    -30%
    ST      15.8/s  786%   18%    --    -18%
    keysort 19.3/s  980%   43%   22%      --

              Rate basic   GRT    ST keysort
    basic   1.70/s    --  -87%  -89%    -91%
    GRT     13.3/s  681%    --  -15%    -27%
    ST      15.6/s  817%   17%    --    -14%
    keysort 18.3/s  972%   37%   17%      --

Sort::Key isn't only the cleanest and simplest of all the solutions (including the builtin `sort`), it's the fastest! It's 17-22% faster than the next fastest.

    #!/usr/bin/perl
    use strict;
    use warnings;

    use Benchmark     qw( cmpthese );
    use File::Slurper qw( read_text );
    use Sort::Key     qw( rikeysort );

    my @unsorted = split /^(?=>>> )/m, read_text( "try3.txt" );
    @unsorted = ( @unsorted ) x 10_000;  # 90_000 lines (30_000 records)

    sub basic {
       my @sorted =
          sort {
             my ( $an ) = $a =~ /(\d+)%/;
             my ( $bn ) = $b =~ /(\d+)%/;
             $bn <=> $an
          }
             @unsorted;
    }

    sub ST {
       my @sorted =
          map $_->[0],
             sort { $b->[1] <=> $a->[1] }
                map [ $_, /(\d+)%/ ],
                   @unsorted;
    }

    sub GRT {
       my @sorted =
          map substr( $_, 4 ),
             sort
                map { /(\d+)%/ ? ( ~ pack( "N", $1 ) . $_ ) : () }
                   @unsorted;
    }

    sub keysort {
       my @sorted = rikeysort { ( /(\d+)%/ )[0] } @unsorted;
    }

    cmpthese( -3, {
       basic   => \&basic,
       ST      => \&ST,
       GRT     => \&GRT,
       keysort => \&keysort,
    } );

Note: I found `substr( $_, 4 )` to be slightly faster than `unpack( "xa4", $_ )`, thus the change in GRT.
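For anyone unfamiliar with the GRT variant above, here is a minimal sketch of the packing trick in isolation. The sample records and the name `grt_desc` are made up for illustration; only core Perl is used. The idea, as far as I can tell, is that `pack "N"` produces a fixed-width 4-byte big-endian key, and complementing those bytes with `~` makes a plain ascending string `sort` yield descending numeric order. Note that this relies on `~` acting on strings, so it assumes the `bitwise` feature is not enabled.

```perl
use strict;
use warnings;

# Hypothetical sample records; the real input comes from try3.txt.
my @records = ( "a 7%\n", "b 100%\n", "c 42%\n" );

# Sort records by their percentage, largest first, using a packed prefix.
sub grt_desc {
    my @recs = @_;
    return
        map  { substr( $_, 4 ) }                            # 3. strip the 4-byte key
        sort                                                # 2. plain string sort
        map  { /(\d+)%/ ? ( ~pack( "N", $1 ) . $_ ) : () }  # 1. prepend inverted key
        @recs;
}

print grt_desc( @records );
```

Records without a percentage are dropped by the `: ()` branch, which matches the behaviour of the `GRT` sub in the benchmark.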
Re^9: schwartzian transform problem by Anonymous Monk on Mar 26, 2025 at 22:21 UTC
Regexes ARE slow (of course everyone here knows). To the extent I can trust my Strawberries and/or my crippled Kaby Lake to benchmark anything, the following is twice as fast (also, I guess `rnkeysort` should have been used, and it's slower):

    sub keysort1 {
        my @sorted = rnkeysort {
            substr $_, rindex( $_, '%' ) - 3, 3
        } @unsorted;
    }

And assuming `$s` keeps the (un|pre)split input, and if each 2nd '%' is guaranteed to be an anchor, even if records have different length/layout, then the following unclean, ugly one is still faster than `keysort` for me:

    sub test1 {
        my $i = my $j = 0;
        my @nums;
        $j ^= 1 or push @nums, substr $s, $i - 3, 3
            while -1 != ( $i = index $s, '%', $i + 1 );
        my @sorted = @unsorted[
            sort { $nums[$b] <=> $nums[$a] } 0 .. $#nums
        ]
    }
    __END__
              Rate keysort test1
    keysort 21.6/s      --  -30%
    test1   30.6/s     42%    --
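A small core-only sketch of the two ideas above: taking the key with `rindex`/`substr` instead of a regex, and sorting an index list by precomputed keys before slicing. The records and the names `key_regex`, `key_substr` and `sort_desc_by` are hypothetical. It assumes the percentage sits right-aligned in a 3-character field just before the last '%', padded with spaces, since anything else in that field would not numify cleanly.

```perl
use strict;
use warnings;

# Hypothetical records: percentage right-aligned in a 3-character field.
my @records = ( "alpha:   7%\n", "beta:  100%\n", "gamma:  42%\n" );

# Key via regex capture.
sub key_regex {
    my ( $n ) = $_[0] =~ /(\d+)%/;
    return $n;
}

# Key via the fixed-width field before the last '%' (no regex engine involved).
sub key_substr {
    return substr( $_[0], rindex( $_[0], '%' ) - 3, 3 );
}

# Precompute one key per record, sort the indices numerically descending,
# then return the records in that order via an array slice.
sub sort_desc_by {
    my ( $key_fn, @recs ) = @_;
    my @keys = map { $key_fn->( $_ ) } @recs;
    return @recs[ sort { $keys[$b] <=> $keys[$a] } 0 .. $#recs ];
}

print sort_desc_by( \&key_substr, @records );
```

Both key functions should give the same ordering on data that meets the layout assumption; which one is faster is something to measure with `cmpthese` on your own input rather than take from this sketch.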