in reply to Re^5: How can I do a numeric sort on a substring? (context matters)
in thread How can I do a numeric sort on a substring?
Out of curiosity I added some subs using Sort::Key. sort_key_natural is the natsort function from Sort::Key::Natural while sort_key_integer uses the ikeysort function from Sort::Key in tandem with substr.
Perl & OS: v5.28.2 on MSWin32 Unordered data (for preamble tests): a-10 a-01 a-22 a-2 a-0 a-3 a-000 a-1 a-12345 a-1 Preamble tests: grt_pack_expr: a-0 a-000 a-01 a-1 a-1 a-2 a-3 a-10 a-22 a- +12345 grt_pack_expr_q: a-0 a-000 a-01 a-1 a-1 a-2 a-3 a-10 a-22 a- +12345 sort_key_integer: a-0 a-000 a-01 a-1 a-1 a-2 a-3 a-10 a-22 a- +12345 sort_key_natural: a-0 a-000 a-01 a-1 a-1 a-2 a-3 a-10 a-22 a- +12345 st_regex: a-0 a-000 a-01 a-1 a-1 a-2 a-3 a-10 a-22 a- +12345 st_regex_anchored: a-0 a-000 a-01 a-1 a-1 a-2 a-3 a-10 a-22 a- +12345 st_regex_anch_expr_ni: a-0 a-000 a-01 a-1 a-1 a-2 a-3 a-10 a-22 a- +12345 st_regex_anch_ni: a-0 a-000 a-01 a-1 a-1 a-2 a-3 a-10 a-22 a- +12345 st_regex_expr_ni: a-0 a-000 a-01 a-1 a-1 a-2 a-3 a-10 a-22 a- +12345 st_regex_no_index: a-0 a-000 a-01 a-1 a-1 a-2 a-3 a-10 a-22 a- +12345 map_cat_substr: a-0 a-000 a-01 a-1 a-1 a-2 a-3 a-10 a-22 a- +12345 map_cat_substr_len: a-0 a-000 a-01 a-1 a-1 a-2 a-3 a-10 a-22 a- +12345 sort_pack: a-0 a-000 a-01 a-1 a-1 a-2 a-3 a-10 a-22 a- +12345 sort_regex: a-0 a-000 a-01 a-1 a-1 a-2 a-3 a-10 a-22 a- +12345 sort_regex_anchored: a-0 a-000 a-01 a-1 a-1 a-2 a-3 a-10 a-22 a- +12345 sort_substr: a-0 a-000 a-01 a-1 a-1 a-2 a-3 a-10 a-22 a- +12345 Legend: GRTpe: grt_pack_expr GRTpeq: grt_pack_expr_q SKi: sort_key_integer SKn: sort_key_natural STr: st_regex STra: st_regex_anchored STraen: st_regex_anch_expr_ni STran: st_regex_anch_ni STren: st_regex_expr_ni STrn: st_regex_no_index mcs: map_cat_substr mcsl: map_cat_substr_len sp: sort_pack sr: sort_regex sra: sort_regex_anchored ss: sort_substr Benchmarks: Note: Unordered data extended with 'map "a-$_", shuffle 0..10000' Rate sra sr SKn sp STraen STr STrn STra STran STren + ss GRTpe GRTpeq mcsl mcs SKi sra 4.59/s -- -0% -59% -68% -80% -81% -82% -82% -82% -83% +-84% -91% -91% -92% -93% -95% sr 4.59/s 0% -- -59% -68% -80% -81% -82% -82% -82% -83% +-84% -91% -91% -92% -93% -95% SKn 11.3/s 147% 147% -- -22% -52% -53% -55% -55% -56% -57% +-61% -77% -78% -81% -82% -87% sp 14.5/s 217% 217% 28% -- -38% -40% -42% -43% -43% -45% +-49% -71% -71% -76% -77% -84% STraen 23.5/s 411% 411% 107% 61% -- -3% -6% -7% -8% -12% +-18% -53% -54% -61% -63% -74% STr 24.3/s 429% 429% 115% 67% 4% -- -3% -4% -5% -8% +-15% -52% -52% -59% -62% -73% STrn 25.0/s 445% 444% 121% 72% 7% 3% -- -1% -2% -6% +-13% -50% -51% -58% -61% -72% STra 25.3/s 452% 451% 124% 74% 8% 4% 1% -- -1% -5% +-12% -50% -50% -58% -60% -72% STran 25.5/s 455% 455% 125% 75% 9% 5% 2% 1% -- -4% +-11% -49% -50% -57% -60% -72% STren 26.6/s 478% 478% 134% 83% 13% 9% 6% 5% 4% -- + -8% -47% -47% -56% -59% -70% ss 28.7/s 525% 525% 153% 97% 22% 18% 15% 13% 13% 8% + -- -43% -43% -52% -55% -68% GRTpe 50.2/s 993% 992% 343% 245% 114% 106% 101% 98% 97% 89% + 75% -- -1% -16% -22% -44% GRTpeq 50.5/s 1000% 1000% 346% 248% 115% 108% 102% 100% 98% 90% + 76% 1% -- -15% -21% -44% mcsl 59.8/s 1202% 1201% 428% 311% 155% 146% 139% 136% 135% 125% +108% 19% 18% -- -7% -33% mcs 64.0/s 1293% 1293% 465% 340% 173% 163% 156% 153% 151% 141% +123% 28% 27% 7% -- -29% SKi 89.5/s 1849% 1849% 690% 516% 281% 268% 258% 253% 251% 237% +212% 78% 77% 50% 40% --
The natsort approach is not particularly fast, but this is perhaps to be expected given it is a general purpose function (as are the unanchored regex approaches). I guess the integer key approach is faster as it takes advantage of direct string operations when building the keys, and then whatever optimisations Sort::Key uses internally.
I assume the differences in the order of the other approaches compared with Lanx's is due to the code being run on Strawberry perl 5.28. It would be interesting to know how the Sort::Key approaches go under a more recent Perl.
Edit: And now I look at the source code for Sort::Key::Natural, it is uses a regex approach to divide the string and pad out the numeric sections, so it is not surprising that it is slower than the other regex based approaches here. https://metacpan.org/dist/Sort-Key/source/lib/Sort/Key/Natural.pm#L34.
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
Re^7: How can I do a numeric sort on a substring? (context matters)
by salva (Canon) on Jun 28, 2021 at 09:15 UTC | |
by swl (Prior) on Jun 28, 2021 at 09:45 UTC | |
|
Re^7: How can I do a numeric sort on a substring? (context matters)
by kcott (Archbishop) on Jun 28, 2021 at 08:36 UTC |