Beefy Boxes and Bandwidth Generously Provided by pair Networks
We don't bite newbies here... much

Custom Sort An AoA

by Limbic~Region (Chancellor)
on Apr 01, 2014 at 16:04 UTC ( [id://1080594] : perlquestion . print w/replies, xml ) Need Help??

Limbic~Region has asked for the wisdom of the Perl Monks concerning the following question:

I have an AoA that I need to sort in a way that I can't think of an immediate solution to. Each element will be an anonymous array of a variable number of strings. The first sort criteria is easy as it is based on the number of elements in the array.
@list = sort {@$a <=> @$b} @list;
The secondary sort criteria is where it gets tricky. For elements that are an anonymous array of the same size, I want them to appear in ascending asciibetical order working from the last element to the front.
['blah', 'asdf', 'foo', 'bar'], ['two'], ['zzz', 'def', 'ghi'], ['one'], ['mmm', 'def', 'ghi'], ['qqq', 'xyz', 'aaa']
should end up sorted as
['one'], ['two'], ['qqq', 'xyz', 'aaa'], ['mmm', 'def', 'ghi'], ['zzz', 'def', 'ghi'], ['blah', 'asdf', 'foo', 'bar'],
Any ideas?

Update: In an effort to be more clear, here are the instructions I would give someone to sort them by hand.

  1. First order by the number of elements with fewest first
  2. Break the items into groups based on the number of elements
  3. For each group, order the items based on the last element
  4. To break ties with the last element, consider the 2nd to the last element
  5. To break ties with the nth element, consider the n-1th element
While in this case, I need ASCIIbetical order, I am interested in a more general solution.

Cheers - L~R

Replies are listed 'Best First'.
Re: Custom Sort An AoA
by Corion (Patriarch) on Apr 01, 2014 at 16:10 UTC

    The general trick is to append your additional sort criteria to the sort block.

    @list= sort { @$a <=> @$b || ... secondary criterion ... || ... tertiary criterion ... } @list;

    So all that remains is to find the appropriate expression to find and compare the last elements of each array:

    $a->[-1] cmp $b->[-1] # untested

    Taken together, your sort expression would be

    @list= sort { @$a <=> @$b || $a->[-1] cmp $b->[-1] } @list;
      I must not have done a good job of explaining.

      Start at the last element and if they are equivalent then check the next to the last and if they are equivalent check the next to the next to the last....

      Your solution fails to produce the correct output because you are only comparing the last element.

      Cheers - L~R

Re: Custom Sort An AoA
by kennethk (Abbot) on Apr 01, 2014 at 16:34 UTC
    Seems like this should be straightforward:
    my @list = sort { @$a <=> @$b or join(' ', reverse @$a) cmp join(' ', reverse @$b) } @input;

    Does that fit the bill? Seems like my understanding of the spec is inconsistent with the other implementations, so I don't know if I've missed something.

    If the array operations turn pricey, you could optimize with an Orcish Maneuver.

    #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

      This will work and that's what matters. Can you think of a more general purpose solution though (consider if the elements had been numerical and what was desired was numerical order for instance).

      Cheers - L~R

        If all the data were numeric and limited in length, you could zero-pad it before the join and then CMP it since <=> will probably explode on longer arrays.

        If you need it to be fully generic, you could always break down and make the sort block be a complex sub that loops over the elements and returns once it finds a difference.

        sub sortIt { my $result = @$a <=> @$b; my $idx = $#$a; while (!$result && $idx >=0) { $result = $a->[$idx] cmp $b->[$idx]; $idx--; } return $result; }
        Lots of discussion has ensued since I checked out yesterday, but the most generic treatment I can think of is essentially pre-computing the comparison strings in two passes, so that you can be sure that your arguments are properly conditioned:
        #!/usr/bin/perl use strict; use warnings; my @input = ( ['blah', 'asdf', 'foo', 'bar'], ['two'], ['zzz', 'def', 'ghi'], ['one'], ['mmm', 'def', 'ghi'], ['qqq', 'xyz', 'aaa'], ); my @list = do { my $max_array = 0; my $max_word = 0; for (@input) { $max_array = @$_ if @$_ > $max_array; for (@$_) { $max_word = length if length > $max_word; } } my $digit = 1 + int log($max_array)/log(10); my $format = "%$digit.d" . ("%-${max_word}s") x $max_array; my %cache = map {$_ => sprintf $format, 0+@$_, reverse(@$_), ( +'') x $max_array} @input; sort {$cache{$a} cmp $cache{$b}} @input; }; $" = "', '"; print "['@$_']\n" for @list;
        My measurement approach means that it's no longer sensitive to delimiter choice, but of course this is a conservative approach to that since it assumes one $max_word for all terms. The empty string padding in the sprintf is just to silence warnings.

        If you want a numerical sorting for elements in place of lexical sorting, you could swap "%-${max_word}s" in the format constructor to "%${max_word}.d" More complex the pattern, the more complex the construction - you can see how impressively the code exploded for just these changes. You could even support a mixed mode by flopping between "%-${max_word}s" and "%${max_word}.d" depending on looks_like_number EXPR.

        #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

      Sounds like that could fail on cases where there is a space in the data.

      But as long as you can find a separator value that does not occur, you should be good.

        True, but I think that most people would feel uncomfortable with join chr(1), ... See Re^3: Custom Sort An AoA for a more rigorous solution, the complexity of which makes my first suggestion feel very inviting.

        #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

Re: Custom Sort An AoA
by davido (Cardinal) on Apr 01, 2014 at 18:30 UTC

    This solution doesn't optimize for minimizing work inside the sort routine, but I think it does optimize for clarity and generality (it would be easy to substitute <=> for cmp, for example):

    use Test::More tests => 1; my @unsorted = ( [ 'blah', 'asdf', 'foo', 'bar' ], ['two'], [ 'zzz', 'def', 'ghi' ], ['one'], [ 'mmm', 'def', 'ghi' ], [ 'qqq', 'xyz', 'aaa' ] ); my @wanted = ( ['one'], ['two'], [ 'qqq', 'xyz', 'aaa' ], [ 'mmm', 'def', 'ghi' ], [ 'zzz', 'def', 'ghi' ], [ 'blah', 'asdf', 'foo', 'bar' ] ); my @sorted = sort { @$a <=> @$b || do { my ( $left, $right ) = ( [@$a], [@$b] ); my $res = 0; $res = pop(@$left) cmp pop(@$right) while @$left && !$res; $res; }; } @unsorted; diag "@{$_}\n" for @sorted; is_deeply( \@sorted, \@wanted, 'Sorted array matches expectation.' );


    1..1 # one # two # qqq xyz aaa # mmm def ghi # zzz def ghi # blah asdf foo bar ok 1 - Sorted array matches expectation.

    Update: Here's a version that doesn't waste time making copies of the elements for the purpose of pop:

    use Test::More tests => 1; my @unsorted = ( [ 'blah', 'asdf', 'foo', 'bar' ], ['two'], [ 'zzz', 'def', 'ghi' ], ['one'], [ 'mmm', 'def', 'ghi' ], [ 'qqq', 'xyz', 'aaa' ] ); my @wanted = ( ['one'], ['two'], [ 'qqq', 'xyz', 'aaa' ], [ 'mmm', 'def', 'ghi' ], [ 'zzz', 'def', 'ghi' ], [ 'blah', 'asdf', 'foo', 'bar' ] ); my @sorted = sort { my $res; @$a <=> @$b || do { $res = $a->[$_] cmp $b->[$_] and return $res for reverse 0 .. $ +#{$a} }; } @unsorted; diag "@{$_}\n" for @sorted; is_deeply( \@sorted, \@wanted, 'Sorted array matches expectation.' );

    Same output as above.


Re: Custom Sort An AoA
by AnomalousMonk (Archbishop) on Apr 01, 2014 at 17:40 UTC

    A decorated solution. Assumes none of the strings in the sub-arrays contain nulls, hence you may not consider this approach 'general'! Should be default-sort fast.

    c:\@Work\Perl\monks>perl -wMstrict -MData::Dump -le "my @list = ( ['blah', 'asdf', 'foo', 'bar'], ['two'], ['zzz', 'def', 'ghi'], ['one'], ['mmm', 'def', 'ghi'], ['qqq', 'xyz', 'aaa'], ); ;; my @sorted = map undecorate(\@list, $_), sort map decorate(\@list, $_), 0 .. $#list ; dd \@sorted; ;; sub decorate { my ($ar_list, $i) = @_; ;; my $elems = @{ $ar_list->[$i] }; my @smeti = reverse @{ $ar_list->[$i] }; ;; return pack qq{N (Z*)$elems N}, $elems, @smeti, $i } ;; sub undecorate { my ($ar_list, $decoration) = @_; ;; return $ar_list->[ unpack 'x* X[N] N', $decoration ]; } " [ ["one"], ["two"], ["qqq", "xyz", "aaa"], ["mmm", "def", "ghi"], ["zzz", "def", "ghi"], ["blah", "asdf", "foo", "bar"], ]
Re: Custom Sort An AoA
by johngg (Canon) on Apr 01, 2014 at 18:34 UTC

    A kind-of recursive sort that doesn't require decoration.

    use strict; use warnings; use Data::Dumper; my @AoA = ( ['blah', 'asdf', 'foo', 'bar'], ['two'], ['zzz', 'def', 'ghi'], ['one'], ['mmm', 'def', 'ghi'], ['qqq', 'xyz', 'aaa'], ); my @sortedAoA = do { my $recSort; $recSort = sub { my @arrA = @{ $_[ 0 ] }; my @arrB = @{ $_[ 1 ] }; return 0 unless @arrA; return ( pop( @arrA ) cmp pop( @arrB ) ) || $recSort->( \ @arrA, \ @arrB ); }; sort { @$a <=> @$b || $recSort->( $a, $b ) } @AoA; }; print Data::Dumper->Dumpxs( [ \ @sortedAoA ], [ qw{ *sortedAoA } ] );

    The output.

    I hope this is of interest.

    Update: Renamed the @revA and @revB variables to @arrA and @arrB as the final solution used pop rather than reverse and shift but I'd forgotten to rename at the time.



Re: Custom Sort An AoA
by wind (Priest) on Apr 01, 2014 at 20:28 UTC

    A solution using List::MoreUtils.

    Isn't the most efficient since it doesn't shortcut the comparing of array elements. Also duplicates the use of localized $a and $b, but overall I think this is very readable.

    use List::MoreUtils qw(lastval pairwise); my @array = ( ['blah', 'asdf', 'foo', 'bar'], ['two'], ['zzz', 'def', 'ghi'], ['one'], ['mmm', 'def', 'ghi'], ['qqq', 'xyz', 'aaa'], ); @array = sort {@$a <=> @$b or lastval {$_} pairwise {$a cmp $b} @$a, @ +$b} @array; use Data::Dump; dd \@array;
    - Miller
Re: Custom Sort An AoA
by kcott (Archbishop) on Apr 02, 2014 at 04:01 UTC

    G'day Limbic~Region,

    This is a general solution which appears to tick all the boxes:

    @list = sort { @$a <=> @$b || do { my $cmp; ($cmp = $a->[-$_] cmp $b->[-$_]) ? last : next for 1 .. @$a; $cmp; } } @list;

    Here's my test:

    Update: I wrote that this was "a general solution"; however, having reviewed some of your comments in this thread, I see what I've provided wasn't what you meant.

    I've converted the code I originally posted, into what I suspect is closer to what you want; but, if it's not, please provide more details about what you're looking for.

    #!/usr/bin/env perl use strict; use warnings; use Scalar::Util qw{looks_like_number}; my @master_alpha_list = ( ['blah', 'asdf', 'foo', 'bar'], ['two'], ['zzz', 'def', 'ghi'], ['one'], ['mmm', 'def', 'ghi'], ['qqq', 'xyz', 'aaa'] ); my @master_num_list = ( [ 0, 1, 2, 3 ], [ 2 ], [ 9, 3, 6 ], [ 1 ], [ 5, 3, 6 ], [ 7, 8, 1 ], ); my @alpha_asc = sort { @$a <=> @$b || gen_sort() } @master_alpha_list; my @alpha_dsc = sort { @$a <=> @$b || gen_sort(1) } @master_alpha_list +; my @num_asc = sort { @$a <=> @$b || gen_sort() } @master_num_list; my @num_dsc = sort { @$a <=> @$b || gen_sort(1) } @master_num_list; use Data::Dump; dd $_ for (\@alpha_asc, \@alpha_dsc, \@num_asc, \@num_dsc); sub gen_sort { my ($desc) = @_; my $compare = looks_like_number($a->[0]) ? \&cmp_num : \&cmp_alpha +; my ($x, $y) = $desc ? ($b, $a) : ($a, $b); my $cmp; ($cmp = $compare->($x->[-$_], $y->[-$_])) ? last : next for 1 .. @ +$a; return $cmp; } sub cmp_alpha { $_[0] cmp $_[1] } sub cmp_num { $_[0] <=> $_[1] }


    [ ["one"], ["two"], ["qqq", "xyz", "aaa"], ["mmm", "def", "ghi"], ["zzz", "def", "ghi"], ["blah", "asdf", "foo", "bar"], ] [ ["two"], ["one"], ["zzz", "def", "ghi"], ["mmm", "def", "ghi"], ["qqq", "xyz", "aaa"], ["blah", "asdf", "foo", "bar"], ] [[1], [2], [7, 8, 1], [5, 3, 6], [9, 3, 6], [0 .. 3]] [[2], [1], [9, 3, 6], [5, 3, 6], [7, 8, 1], [0 .. 3]]

    -- Ken

Re: Custom Sort An AoA
by Anonymous Monk on Apr 01, 2014 at 16:08 UTC
    decoreate with  [ sprintf '%06d %s', scalar(@$_), something(@$_) ] where something is whatever you mean by ascending asciibetical, maybe simply join