in reply to Re^3: How can I do a numeric sort on a substring? [Benchmark]
in thread How can I do a numeric sort on a substring?
> I added for the express purpose of heading off such feedback
Oops, sorry I missed that. :/
I can't dig deeper right now, but here are some suggestions for solving the mystery
Cheers Rolf
(addicted to the Perl Programming Language :)
Re^5: How can I do a numeric sort on a substring? (context matters)
by LanX (Saint) on Jun 27, 2021 at 13:46 UTC
My suspicion was justified: the benchmarks run in void context, which is why the simple sorts do nothing at all (and nothing is fast ;). I took your code and forced all subs to operate in list context by prepending @ordered = to the first line. That's the result with 10000 elements (you can also adjust $max for more or fewer elements).
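To illustrate the void-context effect, here is a minimal sketch. It is not the benchmark code from this thread; the data layout (a two-letter prefix followed by a number) and the sub names are made up for the example. It counts how often the comparator runs when the same sub is called in void context and in list context.

```perl
use strict;
use warnings;

# Hypothetical data: a two-letter prefix followed by a number (1..100, shuffled).
my @data = map { 'id' . ($_ * 37 % 101) } 1 .. 100;

my $calls = 0;    # counts how often the comparator actually runs
sub by_number { $calls++; substr($a, 2) <=> substr($b, 2) }

# The sort is the last statement, so it inherits the caller's context.
sub sorter { sort by_number @data }

sorter();                    # void context, like an unassigned benchmark sub
my $void_calls = $calls;

$calls = 0;
my @ordered = sorter();      # list context, forced by the assignment
my $list_calls = $calls;

print "comparisons in void context: $void_calls\n";
print "comparisons in list context: $list_calls\n";
```

If the void-context count is zero while the list-context count is not, the sort was skipped entirely in the first call, which would explain implausibly fast timings.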
Here's the code:
Cheers Rolf
by kcott (Archbishop) on Jun 27, 2021 at 23:22 UTC
++ Many thanks for tracking down the problem. Much appreciated. The results are now more in line with what I would have expected. I see that Perl's string handling function, substr, outstrips the regex solutions: I have been recommending, for a very long time, that string functions be chosen over regexes (where they provide equivalent functionality). I should probably add some ST routines (e.g. STss, STmcs) to see how they fare; for instance, would GRTpe be faster than STss? I'm currently at $work, so I can't do that now; I'll look into it this evening (i.e. ~8-10hrs hence).
-- Ken
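For readers unfamiliar with the abbreviations, here is a rough sketch of the two techniques, assuming ST means Schwartzian Transform and GRT means Guttman-Rosler Transform. This is not the STss or GRTpe code from the benchmark; the data layout (two-character prefix, then digits) and the variable names are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical records: two-character prefix followed by a non-negative integer.
my @data = qw(ab12 ab3 ab100 ab7);

# Schwartzian Transform: extract each key once with substr, sort on the
# cached key, then unwrap the original record.
my @st = map  { $_->[1] }
         sort { $a->[0] <=> $b->[0] }
         map  { [ substr($_, 2), $_ ] } @data;

# Guttman-Rosler Transform: prepend a fixed-width packed key so that the
# default string sort can be used, then strip the key again.
# pack 'N' yields 4 big-endian bytes, so this assumes keys fit in 32 bits.
my @keyed = map { pack('N', substr($_, 2)) . $_ } @data;
my @grt   = map { substr($_, 4) } sort @keyed;

print "ST:  @st\n";
print "GRT: @grt\n";
```

Both variants compute each key once per element rather than once per comparison; the GRT additionally avoids calling a Perl-level comparator.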
by kcott (Archbishop) on Jun 28, 2021 at 08:21 UTC
I wrapped all of the routines in @{[...]} to provide the list context; that was what I'd used in the preamble tests. I added an STss as I had indicated this morning. I decided that STmcs was going to be pretty much the same as STss, so I skipped that one. I did add an mcse which was mcs with map BLOCK replaced by map EXPR.
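A small sketch of what the @{[...]} wrapper does, and of the map BLOCK versus map EXPR forms. The sub names and sample data here are hypothetical and are not taken from the benchmark script.

```perl
use strict;
use warnings;

my @seen;    # records the context each call to probe() observed

sub probe {
    push @seen, wantarray() ? 'list' : defined(wantarray()) ? 'scalar' : 'void';
    return (3, 1, 2);
}

sub bare    { probe() }           # inherits whatever context the caller supplies
sub wrapped { @{[ probe() ]} }    # the anonymous array constructor imposes list context

bare();       # called in void context: probe sees 'void'
wrapped();    # also called in void context, but probe sees 'list'

print "contexts seen: @seen\n";

# map BLOCK and map EXPR produce the same list; only the syntax differs.
my @items = qw(ab12 ab3 ab100);
my @block = map { substr($_, 2) } @items;
my @expr  = map substr($_, 2), @items;

print "block: @block\n";
print "expr:  @expr\n";
```

The wrapper builds and discards an anonymous array on every call, so it adds a small, constant overhead to each routine being timed.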
I saw ++swl's post. There wasn't any code there, so I guessed.
I ran the benchmark several times; there were no major differences between runs. Here's a sample output, in the spoiler; it's getting very wide (18 subroutines now) and this post is "Re^7", so it's probably best viewed via the "download" link. And here's the code:
-- Ken
by swl (Prior) on Jun 28, 2021 at 09:49 UTC
by swl (Prior) on Jun 28, 2021 at 04:00 UTC
Out of curiosity I added some subs using Sort::Key. sort_key_natural is the natsort function from Sort::Key::Natural while sort_key_integer uses the ikeysort function from Sort::Key in tandem with substr.
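The subs themselves are not shown in this post, so the following is only a guess at the shape of the integer-key approach. It assumes Sort::Key is installed from CPAN and falls back to a plain core sort when it is not; the data layout is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical records: two-character prefix followed by an integer.
my @data = qw(ab12 ab3 ab100 ab7);

my @sorted;
if (eval { require Sort::Key; 1 }) {
    # ikeysort sorts by the integer key the code ref returns for each
    # element; the element is available as $_ inside the code ref.
    # With "use Sort::Key qw(ikeysort)" this is usually written as
    # ikeysort { substr($_, 2) } @data;
    @sorted = Sort::Key::ikeysort(sub { substr($_, 2) }, @data);
}
else {
    # Core-only fallback so the sketch still runs without the module.
    @sorted = sort { substr($a, 2) <=> substr($b, 2) } @data;
}

print "@sorted\n";
```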
The natsort approach is not particularly fast, but this is perhaps to be expected given it is a general purpose function (as are the unanchored regex approaches). I guess the integer key approach is faster as it takes advantage of direct string operations when building the keys, and then whatever optimisations Sort::Key uses internally. I assume the differences in the order of the other approaches compared with LanX's are due to the code being run on Strawberry Perl 5.28. It would be interesting to know how the Sort::Key approaches go under a more recent Perl.
Edit: Now that I look at the source code for Sort::Key::Natural, it uses a regex approach to divide the string and pad out the numeric sections, so it is not surprising that it is slower than the other regex-based approaches here. https://metacpan.org/dist/Sort-Key/source/lib/Sort/Key/Natural.pm#L34
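To make the "divide the string and pad out the numeric sections" idea concrete, here is a simplified core-only sketch. It is not the code from Sort::Key::Natural: the natural_key helper is hypothetical, and the fixed pad width of 10 digits is an assumption that a general-purpose module would not want to make.

```perl
use strict;
use warnings;

# Build a string key in which every run of digits is zero-padded to a
# fixed width, so that plain string comparison orders the numbers correctly.
# Assumption: no numeric run is longer than 10 digits.
sub natural_key {
    my ($string) = @_;
    (my $key = $string) =~ s/(\d+)/sprintf '%010d', $1/ge;
    return $key;
}

my @data = qw(file10 file9 file100b file2);

# Compute each key once, sort on it as a string, then unwrap the record.
my @sorted = map  { $_->[1] }
             sort { $a->[0] cmp $b->[0] }
             map  { [ natural_key($_), $_ ] } @data;

print "@sorted\n";
```

The regex substitution runs over the whole string for every element, which is the kind of per-key work that a fixed-position substr avoids.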
by salva (Canon) on Jun 28, 2021 at 09:15 UTC
For instance, it can handle arbitrarily large numbers or Unicode.
by swl (Prior) on Jun 28, 2021 at 09:45 UTC
by kcott (Archbishop) on Jun 28, 2021 at 08:36 UTC
G'day swl,
See "Re^7: How can I do a numeric sort on a substring? [Benchmark: reworked and extended]". I've added sort_key_integer and sort_key_natural (I made some guesses about the code) as well as a couple of additions of my own.
"It would be interesting to know how the Sort::Key approaches go under a more recent Perl."
I'm not seeing a huge difference between your output and my latest. SKn is at the slow end of the spectrum; SKi is by far the fastest (substantially faster on my system with Perl 5.34.0).
-- Ken