cryptic has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to sort arrays within an array. For example, say
@array = (['CCI003', '1', 'M'], ['CCI002', '1', 'N'], ['CCI001', '1', 'U'], ['CCI002', '2', 'N'])
then after sorting the array would look like
(['CCI001', '1', 'U'], ['CCI002', '1', 'N'], ['CCI002', '2', 'N'], ['CCI003', '1', 'M'])
The sorting being done on the FIRST element within each sub-array. Here is my test code:
#!/usr/perl5/bin
use strict;

my @array;
my $intElements;
my @prefix1;
my @prefix2;
my @prefix3;
my @prefix4;
my @sorted;

@prefix1 = ('CCI003', '1', 'M');
@prefix2 = ('CCI002', '1', 'N');
@prefix3 = ('CCI001', '1', 'U');
@prefix4 = ('CCI002', '2', 'N');

print STDOUT "Individual arrays: \n";
print STDOUT "@prefix1 \n";
print STDOUT "@prefix2 \n";
print STDOUT "@prefix3 \n";
print STDOUT "@prefix4 \n";

$intElements = push @array, \@prefix1;
$intElements = push @array, \@prefix2;
$intElements = push @array, \@prefix3;
$intElements = push @array, \@prefix4;

print STDOUT "Final Array: \n";
foreach my $rec (@array) {
    print STDOUT "$rec->[0], $rec->[1], $rec->[2] \n";
}

@sorted = sort {$$a[0] <=> $$b[0]} @array;

print "\n";
print STDOUT "Sorted Array: \n";
foreach my $rec (@sorted) {
    print STDOUT "$rec->[0], $rec->[1], $rec->[2] \n";
}

It doesn't seem to do anything to the sort order. This is the first time I'm working with array sorting, so I might be missing something pretty stupid. Any help is appreciated.

Re: Sorting multi-dimensional arrays
by Molt (Chaplain) on Sep 11, 2002 at 16:32 UTC

    Your problem is that the <=> operator in the sort is a numeric comparison, and since your strings don't begin with a number they all evaluate to 0, so every pair compares as equal.

    Simple solution: Replace the <=> with a cmp
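A minimal sketch of the difference, using the data from the question. The variable names are illustrative, and the warning category is switched off only around the numeric comparison so the result can be printed; with warnings left on, Perl would complain that the arguments are not numeric, which is a useful hint in itself.

```perl
use strict;
use warnings;

my @array = (
    ['CCI003', '1', 'M'],
    ['CCI002', '1', 'N'],
    ['CCI001', '1', 'U'],
    ['CCI002', '2', 'N'],
);

# Numeric comparison of two non-numeric strings: both sides become 0.
my ($x, $y) = ('CCI003', 'CCI001');
my $numeric_result = do {
    no warnings 'numeric';
    $x <=> $y;
};
print "numeric comparison: $numeric_result\n";   # prints 0

# String comparison on the first element of each sub-array.
my @sorted = sort { $a->[0] cmp $b->[0] } @array;

print join(', ', @$_), "\n" for @sorted;
```

The two CCI002 rows compare as equal here, so their relative order depends on the sort implementation unless a tie-breaker is added.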

      Except that will do the wrong thing for CC100 vs CC1000..

      Makeshifts last the longest.

        This is true; I was assuming (big mistake...) that, given the leading zeroes, these numbers were all zero-padded. If not, the following Schwartzian Transform should work; it's just not as simple a fix as the earlier one.

        # Note: Read Schwartzian Transforms from bottom-up,
        # it makes it easier to understand that way.
        @sorted =
            # And finally map back to the original array.
            # [3,['CCI003','1','N']] -> ['CCI003','1','N'].
            map { $_->[1] }
            # Sort by the numeric component we just extracted
            sort { $a->[0] <=> $b->[0] }
            # Map onto an array consisting of the numeric component
            # of the first part.
            # ['CCI003','1','N'] -> [3,['CCI003','1','N']].
            map { [ $_->[0]=~/(\d+)/, $_] }
            # Take the initial array
            @array;

        Update: Fixed bad formatting.

Re: Sorting multi-dimensional arrays
by Aristotle (Chancellor) on Sep 11, 2002 at 17:23 UTC
    Molt has the problem right, even if his proposed solution is likely not optimal. You will just have to process the elements first so as to turn them into acceptable numeric values. If you always have two alphanumeric characters in front, this may do:

        @sorted = sort { substr($a->[0], 2) <=> substr($b->[0], 2) } @array;

    Do be aware of the ubiquitous Schwartzian Transform, which can immensely accelerate sorting on expensive-to-calculate derived keys.

    Makeshifts last the longest.
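A small sketch for the data in the original question, where the prefix is three letters (CCI) rather than two. With that data, an offset of 2 would leave "I003", which is still non-numeric, so the offset would need to be 3. Pulling out the digits with a regex avoids hard-coding the prefix length. The helper name `numeric_part` is hypothetical, and treating a key without digits as 0 is an assumption.

```perl
use strict;
use warnings;

my @array = (
    ['CCI003', '1', 'M'],
    ['CCI002', '1', 'N'],
    ['CCI001', '1', 'U'],
    ['CCI002', '2', 'N'],
);

# Hypothetical helper: return the first run of digits in a key,
# or 0 if the key contains no digits at all.
sub numeric_part {
    my ($key) = @_;
    return $key =~ /(\d+)/ ? $1 : 0;
}

my @sorted = sort { numeric_part($a->[0]) <=> numeric_part($b->[0]) } @array;

print join(', ', @$_), "\n" for @sorted;
```

This recomputes the key on every comparison, which is fine for a handful of rows; for large lists the Schwartzian Transform mentioned above computes each key only once.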

      Unless he's going to sort a VERY large AoA, cmp should be plenty fast enough. He probably does not need to go through setting up an ST or GRT.

      And judging from his example, it looks like his keys are fixed width, so he doesn't have to worry about "CG100" vs "CG1000", for instance.
      --
      Mike

        Even fixed length strings are not really a guarantee; cmp will do the wrong thing for CC100A vs CC1000 too. It's always better to be safe than sorry. Note my code sample did not use a Schwartzian Transform because if the data set is as small as what he showed, it isn't worth the CPU time to set it up; I merely wanted him to be aware of it, should he need such.

        Makeshifts last the longest.
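To make the CC100A vs CC1000 concern concrete, here is a sketch that splits each key into a non-digit prefix, a number, and a trailing suffix, then compares the three parts in turn. The key format (prefix, digits, optional suffix) and the sample keys are assumptions for illustration; they are not taken from the original poster's data.

```perl
use strict;
use warnings;

my @keys = qw(CC1000 CC200 CC100A CC100);

# Plain string comparison, for contrast.
my @by_string = sort { $a cmp $b } @keys;

# Schwartzian Transform: split each key once, sort on the parts,
# then map back to the original key.
my @by_parts =
    map  { $_->[0] }
    sort {
           $a->[1] cmp $b->[1]    # non-digit prefix, as a string
        || $a->[2] <=> $b->[2]    # number, numerically
        || $a->[3] cmp $b->[3]    # trailing suffix, as a string
    }
    map  {
        my ($prefix, $number, $suffix) = /^(\D*)(\d*)(.*)$/s;
        [ $_, $prefix, $number || 0, $suffix ]
    }
    @keys;

print "cmp only: @by_string\n";   # CC100 CC1000 CC100A CC200
print "by parts: @by_parts\n";    # CC100 CC100A CC200 CC1000
```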

Re: Sorting multi-dimensional arrays
by shotgunefx (Parson) on Sep 11, 2002 at 23:20 UTC
    @sorted = sort {$$a[0] <=> $$b[0] || $$a[0] cmp $$b[0] } @array;

    prints
    Sorted Array:  
    CCI001, 1, U 
    CCI002, 1, N 
    CCI002, 2, N 
    CCI003, 1, M 
    
    or reverse tests so numbers come before characters.

    -Lee
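The `||` in the comparison above works because a comparison that returns 0 (equal) is false, so the next test gets a chance to decide. For keys like CCI003 the numeric test always returns 0 and the string test does the real work. The same chaining can break ties on a second field, sketched below; sorting ties on the second element numerically is an assumption about what is wanted, not something stated in the question.

```perl
use strict;
use warnings;

# Same rows as the question, with the two CCI002 rows swapped
# so that the tie-breaker has something to do.
my @array = (
    ['CCI003', '1', 'M'],
    ['CCI002', '2', 'N'],
    ['CCI001', '1', 'U'],
    ['CCI002', '1', 'N'],
);

my @sorted = sort {
       $a->[0] cmp $b->[0]    # primary key: the code, as a string
    || $a->[1] <=> $b->[1]    # tie-breaker: the second field, numerically
} @array;

print join(', ', @$_), "\n" for @sorted;
```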

    "To be civilized is to deny one's nature."
Re: Sorting multi-dimensional arrays
by BrowserUk (Patriarch) on Sep 12, 2002 at 05:05 UTC

    If your set of prefixes is reasonably small, but the format and/or the variations in format make it difficult to map the sort order cleanly, then another possibility is to use a lookup table for your sorting. That way you pre-specify the order you want them in and let sort use that pre-specified order to do its work.

    This isn't a very convincing demonstration, but it should serve as an illustration. Its greatest asset is its flexibility, though on large datasets it should also be pretty efficient.

    #! Perl -sw
    use strict;

    my $index = 0;
    my %lookup = map { $_ => $index++; } qw(
        CCI001 CCI002 CCI002A CCI002B CCI100 CCI101 CCI500 CDI001 CDJ001
    );

    my @data = (
        [ qw(CDJ001 1 N)],
        [ qw(CCI100 1 M)],
        [ qw(CCI001 1 M)],
        [ qw(CCI002B 1 N)],
        [ qw(CCI002 1 N)],
        [ qw(CDI001 1 M)],
        [ qw(CCI500 1 M)],
        [ qw(CCI002A 1 N)],
        [ qw(CCI101 1 N)],
    );

    my @sorted = sort{ $lookup{$$a[0]} <=> $lookup{$$b[0]} } @data;

    print "@{$_}\n" for @sorted;

    __DATA__
    # Output
    C:\test>196991
    CCI001 1 M
    CCI002 1 N
    CCI002A 1 N
    CCI002B 1 N
    CCI100 1 M
    CCI101 1 N
    CCI500 1 M
    CDI001 1 M
    CDJ001 1 N

    C:\test>

    Well It's better than the Abottoire, but Yorkshire!
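One thing the lookup-table approach leaves open is what happens to a key that is not in the table: the lookup yields undef, which compares as 0 (with a warning under -w). A possible way to handle that, sketched here as an assumption rather than as part of the code above, is to rank unknown keys after all known ones and fall back to a string comparison among them. The helper name `rank_of` and the ZZZ999 row are hypothetical.

```perl
use strict;
use warnings;

my @order = qw(CCI001 CCI002 CCI002A CDI001);

# Map each known key to its position in @order.
my %lookup;
@lookup{@order} = (0 .. $#order);

my @data = (
    [qw(ZZZ999 1 N)],    # hypothetical key that is not in the table
    [qw(CDI001 1 M)],
    [qw(CCI002A 1 N)],
    [qw(CCI001 1 M)],
);

# Rank for a key: its table position, or one past the end if unknown.
sub rank_of {
    my ($key) = @_;
    return exists $lookup{$key} ? $lookup{$key} : scalar @order;
}

my @sorted = sort {
       rank_of($a->[0]) <=> rank_of($b->[0])
    || $a->[0] cmp $b->[0]    # orders unknown keys among themselves
} @data;

print "@$_\n" for @sorted;
```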
Re: Sorting multi-dimensional arrays
by cryptic (Initiate) on Sep 12, 2002 at 18:25 UTC
    Thank you all for the great response. I will try out one or more of the solutions suggested and see which best fits my problem. I'll post to this thread the scenarios I face and the solution which worked for me!