Re: sort of misunderstanding of sort
by hv (Prior) on Mar 21, 2023 at 11:35 UTC
|
Generally, sort functions are designed to do sorting as efficiently as possible, typically in O(n log(n)) comparisons. I'm not aware of any way to force perl's sort to do all n(n - 1) / 2 comparisons: if the purpose is not sorting, best not use sort() to do it.
The results from the comparator are intended to describe an ordering on the elements: compare(A, B) < 0 asserts that A < B, etc. There is an expectation of consistency here, so we expect identities like these to hold:
A = A
A = B <=> B = A
A < B <=> B > A
A <= B <=> B >= A
(A < B) & (B < C) => A < C
(A <= B) & (B <= C) => A <= C
Note this important paragraph skulking near the end of perldoc -f sort:
The comparison function is required to behave. If it retur
+ns
inconsistent results (sometimes saying $x[1] is less than
+$x[2]
and sometimes saying the opposite, for example) the result
+s are
not well-defined.
For a bit more about what can go wrong, see the accepted response on this related C++ question: What will std::sort do if the comparison is inconsistent?. I'm not aware that perl's implementation is at risk of crashing, but you are certainly entitled to get weird results. | [reply] [Watch: Dir/Any] [d/l] [select] |
Re: sort of misunderstanding of sort
by LanX (Saint) on Mar 21, 2023 at 11:38 UTC
|
I assume you want the cross-product of a list with itself.
sort can't do this because it's designed to be fast with O(n*log(n))
There used to be a way to change the applied algorithm, but that's deprecated now https://perldoc.perl.org/sort
| [reply] [Watch: Dir/Any] [d/l] |
|
>
I assume you want the cross-product
DB<34> sub cross (&$$) { my ($code,$list1,$list2) = @_; local ($a,$b
+); my @res; for $a (@$list1) { for $b (@$list2) { push @res, &$code }
+}; return @res; }
DB<35> x cross { $a + $b } [0..2], [10,20,30]
0 10
1 20
2 30
3 11
4 21
5 31
6 12
7 22
8 32
DB<36>
you can just let $list2 default to $list1 if undef to have a self-operation with only one list.
$list2 //= $list1
| [reply] [Watch: Dir/Any] [d/l] [select] |
Re: sort of misunderstanding of sort
by ikegami (Patriarch) on Mar 21, 2023 at 21:19 UTC
|
Merge Sort performs a constant number of comparison for an input of a given size. (There might be some variations in practice since the smaller subdivisions are usually sorted using something other than merge sort.)
Returning garbage values doesn't not increase the number of comparisons; it just cause the the output to be garbage.
If you wanted to increase the number of comparisons, use an older version of Perl that supports using Quicksort. See the sort pragma.
| [reply] [Watch: Dir/Any] |
|
Thanks ikegami and all who answered,
while now I understand better the matter of sorting, your sentence
> Returning garbage values doesn't not increase the number of comparisons
does not seems to be confirmed by my last example: perl -le "@a = sort{ print qq($a $b); -1 }@ARGV" 1 2 3 4
Infact returning garbage ie: -1 seems to produce always an iteration more in respect of returning 0 or 1
If I understand the mergesort, the above example, should go like:
1 2 3 4
[1 2] [3 4]
[1 3] [3 2] [2 4]
So only [1 4] is skipped because derived by previous [1 2] [2 4] To notice also that returning -1 as garbage returns a weird sorted list 1 3 2 4 while returning always 1 produce (as I expected) a reversed list: 4 3 2 1
As side effect hacking sort can lead to a new shuffle :)
# naive shuffle
perl -le "print for sort { int rand(3) - 1 } @ARGV" 1 2 3 4 5
1
3
5
2
4
L*
There are no rules, there are no thumbs..
Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
| [reply] [Watch: Dir/Any] [d/l] [select] |
|
I did mention there are optimizations at play over a pure merge sort.
In fact, my testing shows that passing n sorted items, it only performs n comparisons. This is clearly not the behaviour of a pure merge sort.
But I was indeed wrong about the amount being constant.
- Merging 1,2,3 with 4,5,6 requires three comparisons.
- Merging 1,7,8 with 4,5,6 requires four comparisons.
| [reply] [Watch: Dir/Any] |
|
|
use strict;
use warnings;
use Benchmark qw(cmpthese);
use List::Util qw(shuffle);
my $count = 10000;
my @numbers = 0..10000;
cmpthese($count, {
'List::Util::shuffle' => sub { my @shuffled = shuffle @numbers },
'Discipulus::shuffle' => sub { my @shuffled = sort { int rand(3) -
+ 1 } @numbers },
'Discipulus::--$|fle' => sub { my @shuffled = sort { --$| } @numbe
+rs },
});
__END__
Rate Discipulus::--$|fle Discipulus::shuffle Lis
+t::Util::shuffle
Discipulus::--$|fle 164/s -- -52%
+ -95%
Discipulus::shuffle 342/s 108% --
+ -89%
List::Util::shuffle 3107/s 1793% 809%
+ --
L*
There are no rules, there are no thumbs..
Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
| [reply] [Watch: Dir/Any] [d/l] [select] |
|
Re: sort of misunderstanding of sort
by eyepopslikeamosquito (Archbishop) on Mar 22, 2023 at 01:10 UTC
|
My recent experiences during Rosetta Code: Long List is Long taught me that vanilla sorting in Perl tends to be slow.
Even Perl experts were surprised.
I was similarly shocked when changing from one sort to two (sort { $href->{$b} <=> $href->{$a} || $a cmp $b } keys %{$href}
to sort { $href->{$b} <=> $href->{$a} } sort keys %{$href}) gave a massive performance boost ...
and nearly fell off my chair when I later accidentally out-mario-royed marioroy via GRT. :)
I also learnt that stable sorting algorithms tend to run significantly slower than unstable ones in practice.
Yet I don't see Perl supporting unstable sorting algorithms in the near future given:
From sort - perl pragma to control sort() behaviour "The sort pragma is now a no-op, and its use is discouraged ... The default sort has been stable since v5.8.0, and given this consistent behaviour for almost two decades, everyone has come to assume stability"
The other huge sorting performance boost (pun intended) I remember was here when
Boost's sort::block_indirect_sort parallel sort algorithm,
running on 8 threads simultaneously, ran massively faster than single-threaded vanilla C++ std::sort.
I would be interested to learn of future plans to improve Perl sorting performance.
See also my mandatory list of Sorting References.
| [reply] [Watch: Dir/Any] [d/l] [select] |
Re: sort of misunderstanding of sort
by bliako (Monsignor) on Mar 21, 2023 at 12:56 UTC
|
one iteration more with -1
basically it's like a logical puzzle you have to solve with asking a *minimum* number of questions (like "is 1 less than 2?") and (ideally) employing heuristics and the identities that hv mentions. In fact Perl's sort does a good job: perl -le 'my $iters;@a=sort{ $iters++; $a<=>$b }map { int rand 50000 }(1..500); print $iters' 3852 whereas an exhaustive sort would ask 500*499/2=124750 questions. So Do What I Mean and Don't @&*%$ Ask Me More Questions Than Necessary hehe
Here is another way to irritate Perl's sort: perl -le 'my$iters;@a=sort{ $iters++;print qq($a $b); [-1,0,1]->[int rand 3] }map { int rand 50000 }(1..500); print $iters'
Edit: see Acme::ICan for another kind of sort | [reply] [Watch: Dir/Any] [d/l] [select] |
Re: sort of misunderstanding of sort
by ikegami (Patriarch) on Mar 21, 2023 at 20:57 UTC
|
[Woops, this doesn't answer the question.]
Checking every single pairing would be wasteful. If we know that a < b and that b < c, then we know that a < c. No need to check that one.
There are algorithms that effectively check every pairing in some circumstances, including Bubble Sort and so-called Quicksort.
Perl used to use Quicksort because it can be fast outside of the degenerate cases. But the degenerate case happens to be an already-sorted list, which is a common thing to sort.
Perl now uses Merge Sort, a much better algorithm. It always performs n log2 n comparisons for an input of size n, which is far less than the n2 approach of checking every pairing. (And it's also stable, meaning that items that compare equally will maintain their relative order. This is a very useful property.)
I linked to the Wikipedia pages of algorithms I mentioned, and they are great. They even include useful animations of the sorting algorithm in action.
| [reply] [Watch: Dir/Any] |
Re: sort of misunderstanding of sort
by Anonymous Monk on Mar 21, 2023 at 12:20 UTC
|
I think values which end to be adjacent in sorted list are guaranteed to be compared, and so your technique works OK. Usual spaceship operator, as well as through-away result with map sort... or 1 for sort... will do, but are longer to type. And in fact -- don't push and then sort again, but update current minimum so far, as LanX did in his question.
| [reply] [Watch: Dir/Any] [d/l] [select] |