in reply to Re^3: Sort mechanics problems ...
in thread Sort mechanics problems with objects and potentially contradicting comparisons (would cause infinite loop)

Sometimes you have to be careful what you wish for. The following test code should produce an infinite loop (because there is no documented alternative):
#!/usr/bin/perl use strict; use warnings; my $iteration = 0; print sort byStupid (qw(A B C)); sub byStupid { $iteration++; $iteration > 10000 and die "10000 is a lot to sort three things"; if ($b eq 'C' and $a eq 'A') { $b cmp $a; } else { $a cmp $b; } }
Unfortunately perl sort is content to produce incorrect output in finite time. i.e. it chose to ignore the correct positioning of B and C in the list and produced CAB as output. The correct behaviour would be an infinite loop or a fatal error because the array in any order is an incorrectly returned result from function 'sort'.

One world, one people

Replies are listed 'Best First'.
Re^5: Sort mechanics problems ... (efficiency)
by tye (Sage) on Jun 02, 2016 at 07:32 UTC

    No.

    As documented, the results are "not well-defined". It is far better for a sort algorithm to not go into an infinite loop. Sort algorithms are heavily optimized (for good reasons) to do as few comparisons as possible. Those optimizations are predicated on the assumption that you actually have a well-ordered set with a well-defined comparison operation. Because that is what people generally have when they call an optimized 'sort'.

    The good sorting algorithms are so optimized that they are actually unable to even notice that your comparison operation is not well-defined. If a 'sort' sees $a < $b, then it would be a waste of resources if it later needed to check if $b < $a. So efficient sorting algorithms never do that. If they even see $a < $b and $b < $c, then they also won't bother comparing $a to $c.

    If you want some kind of 'sort' that is guaranteed to detect the well-definedness of the comparison, then that 'sort' will always be O(n*n), no matter the input.

    But even if you had one of those, having one that can go into an infinite loop would just be stupid. That only makes even a little sense if checking $a < $b a second time might give you a different answer than the first time. And if that really matters to you, then your 'sort' can never ever return, because just because $a < $b the first 10,000 times you checked, it might not stay that way.

    - tye        

Re^5: Sort mechanics problems ...
by haukex (Archbishop) on Jun 02, 2016 at 08:29 UTC

    Thank you for the clarification! It was unclear to me at first if maybe it was some kind of infinite recursion problem, but I see now what you mean. As tye has already explained, the answer is clear: a comparator like that means the result of the sort is not well-defined.

    Although I think in this case a solution other than sort is more appropriate, just to play out the idea in theory: If you wanted to sort by changing criteria, including criteria based on the position of each item in the list, then something resembling an iterative approach might be possible: 1. calculate and cache the sort criteria, 2. run the sort, 3. check if the ordering is appropriate and if not go back to step 1. That way, you're not violating the requirements of sort (the criteria remain stable during the entire run of sort), although you do have the problem that the algorithm might not converge overall.

    Regards,
    -- Hauke D

      For the actual requirement of DB load I can live with what perl sort does: it assumes that the sort is well-defined. And it is, provided the required insertion order actually exists. If there is no solution for the comparison rules, perl sort will break one or more of these rules and deliver anyway a permutation of the list. That permutation will not be a solution to an insoluble schedule and the database will complain. That's fine if I have a single transaction for the entire load and can roll back, which is not difficult to arrange, using DBI.

      One world, one people