BrowserUk has asked for the wisdom of the Perl Monks concerning the following question:

Given an array of arrays:............I want to achieve:

@in = { @out = { qw[M N], qw[C B A], qw[M N Q P O], qw[C G F E D], qw[C B A], qw[C G F H], qw[U S], qw[C G F H I], qw[V T], qw[C G F H J], qw[C G F E D], qw[M K], qw[C G F H], qw[M N], qw[C G F H I], qw[M N Q P O], qw[C G F H J], qw[U S], qw[M K], qw[V T], }; };

While these are single chars, I can do this using

@out = sort{ "@$a" cmp "@$b" } @in;

but once the elements can be strings that could contain embedded spaces it will screw up.

I've had a couple of goes at this, but I can't get it right today.


..and remember there are a lot of things monks are supposed to be but lazy is not one of them

Examine what is said, not who speaks.
1) When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.
2) The only way of discovering the limits of the possible is to venture a little way past them into the impossible
3) Any sufficiently advanced technology is indistinguishable from magic.
Arthur C. Clarke.

Replies are listed 'Best First'.
Re: Sort problem
by tachyon (Chancellor) on Feb 26, 2003 at 19:18 UTC

    A quick Schwartzian Transform does the trick. Your sample is not an array of arrays BTW. I guess you are tired.

    @in = ( [qw(M N)], [qw(M N Q P O)], [qw(C B A)], [qw(U S)], [qw(V T)], [qw(C G F E D)], [qw(C G F H)], [qw(C G F H I)], [qw(C G F H J)], [qw(M K)], ); my @out = map{$_->[0]} sort{$a->[1] cmp $b->[1]} map{[$_, concat($_ +)]} @in; sub concat { my $ary = shift; $ary = join '', @$ary; $ary =~ s/\s//g; return $ary; } use Data::Dumper; # munge the output for compactness and to make it look nicer $Data::Dumper::Indent = 0; print map {s/\[/\n \[/g; $_} Dumper \@out; __DATA__ $VAR1 = [ ['C','B','A'], ['C','G','F','E','D'], ['C','G','F','H'], ['C','G','F','H','I'], ['C','G','F','H','J'], ['M','K'], ['M','N'], ['M','N','Q','P','O'], ['U','S'], ['V','T']];

    cheers

    tachyon

    s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

Re: Sort problem
by dws (Chancellor) on Feb 26, 2003 at 19:27 UTC
    What you've shown isn't an Array of Arrays. But let's assume that's a mistranscription. Something like this might do what you need.
    use strict; use Data::Dumper; my @in = ( [qw(M N)], [qw(M N P O)], [qw(C B A)], [qw(U S)], [qw(V T)], [qw(C G F E D)], [qw(C G F H)], [qw(C G F H I)], [qw(C G F H J)], [qw(M K)] ); sub arrayCompare { my @a = @$a; my @b = @$b; while ( 1 ) { return 0 if @a ==0 && @b == 0; return -1 if @a == 0; return 1 if @b == 0; my $cmp = $a[0] cmp $b[0]; return $cmp if $cmp != 0; shift @a; shift @b; } } my @out = sort arrayCompare @in; print Dumper @out;

    (Side comment: I'm surprised at how many people missed the significance of "once the elements can be strings that could contain embedded spaces" and tried approaches based on concatenation.)

      Thanks dws. First for seeing past my clumbsiness:)

      Second catching the significance of embedded spaces.

      Third for showing me where I was going wrong. The nearest I had got was

      print "@$_" for sort{ my ($o, $i) = (0,0); $i++ until ($i < @$a or $i < @$b) and $o =($a->[$i] cmp $b->[$i]) or $o = @$a <=> @$b; $o; }@deps;
      Which gave me

      C B A C G F H C G F E D C G F H I C G F H J M K M N M N Q P O U S V T
      Close, but no cigar. qw[C G F H] was sorting above qw[C G F E D] but for the life of me I couldn't see why.

      Your code made me realise what was wrong and led to this

      print "@$_" for sort{ my ($o, $i) = (0,0); $i++ until ( $i < @$a or $i < @$b ) and $o=( $a->[$i] cmp $b->[$i] ); $o || @$a <=> @$b; }@deps;

      Which I realise won't win any prices in the clarity-at-all-costs stakes, but I find readable.


      Examine what is said, not who speaks.
      1) When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.
      2) The only way of discovering the limits of the possible is to venture a little way past them into the impossible
      3) Any sufficiently advanced technology is indistinguishable from magic.
      Arthur C. Clarke.

        Why doesn't this work out of interest? Internal spaces are retained but the concat does not add spaces....

        { local $" = ''; @out = sort { "@$a" cmp "@$b" } @in; }

        cheers

        tachyon

        s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

        I think the new code you have above gives 'use of uninitialized value' warnings because the array-length check is not quite right. I think it needs instead:

        $i++ until $i >= @$a or $i >= @$b or $o = ( $a->[$i] cmp $b->[$i] );

        $i == @$a would be just as good as $i >= @$a; I can't decide which is clearer.

        Hugo

      Just concatenate and remove the spaces and throw in a Schwartzian Yransform for efficiency....like this

      cheers

      tachyon

      s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

        Just concatenate and remove the spaces and ...

        You and I have apparently come to quite different understandings of the problem BrowserUK posed. He left a few details out, but I assume that he's illustrating the shape of the data, and is providing a valuable clue to the actual content of the data when he writes "once the elements can be strings that could contain embedded spaces".

        If my assumption is true, then approaches that use concatenation will fail for some inputs, notably some inputs that contain embedded spaces.

        Perhaps this is a good opportunity for BrowserUK to clarify his intent.

•Re: Sort problem
by merlyn (Sage) on Feb 26, 2003 at 19:16 UTC
    @in = { @out = { qw[M N], qw[C B A], qw[M N Q P O], qw[C G F E D], qw[C B A], qw[C G F H], qw[U S], qw[C G F H I], qw[V T], qw[C G F H J], qw[C G F E D], qw[M K], qw[C G F H], qw[M N], qw[C G F H I], qw[M N Q P O], qw[C G F H J], qw[U S], qw[M K], qw[V T], }; };
    Well, first off, that's no where near an array of arrays. You've got an array taking a single element which is a hashref pointing to an anonymous hash, constructed by taking alternating pairs of values, like:
    @in = ({ M => 'N', M => 'N', Q => 'P', O => 'C', ... });
    So, if you started with that, you definitely need to figure out how to write your original problem first.

    I could recommend perllol or perldsc as starting points.

    -- Randal L. Schwartz, Perl hacker
    Be sure to read my standard disclaimer if this is a reply.

      You missed this opportunity so I had to post it for you here We know what BrowserUK meant and as he posts quality answers and interesting questions I reckon HWIWHM is the go rather than giving him a RTFM.

      cheers

      tachyon

      s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

      Your right of course. I should have taken more care when writing up my question. I printed the output from the AoA and then wrapped it up to suggest the structure but made a mess of it.


      ..and remember there are a lot of things monks are supposed to be but lazy is not one of them

      Examine what is said, not who speaks.
      1) When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.
      2) The only way of discovering the limits of the possible is to venture a little way past them into the impossible
      3) Any sufficiently advanced technology is indistinguishable from magic.
      Arthur C. Clarke.
Re: Sort problem
by Hofmator (Curate) on Feb 26, 2003 at 19:13 UTC
    Depending on what your data can contain, you could do something like that:
    { local $" = $;; @out = sort {"@$a" cmp "@$b"} @in; }
    which uses the value of the subscript separator for multiple hashes, by default "\034" which occurs rather seldom in normal text. However this doens't work if your data can contain anything.

    -- Hofmator

Re: Sort problem
by Fletch (Bishop) on Feb 26, 2003 at 19:02 UTC

    Not much help, but I want to mumble something about this looking like a radix sort. Consult your friendly neighborhood algorithms book for possibly more enlightenment.

    Update: Frell, you people are harsh. :) From MAwP, the example in question is ch04/radix-sort and you can grab the sample code tarball.

    Update redux: And if you want that to deal with more-than-one-letter per slot you'll probably have to use something like below rather than just calling ord().

    my $val = 0; $val = $val * 10 + substr( $cur, -1, 1, undef ) while $cur;
Re: Sort problem
by Thelonius (Priest) on Feb 26, 2003 at 19:14 UTC
    As long as you know that there won't be embedded null characters, you can do this:
    { local $" = "\0"; @out = sort {"@$a" cmp "@$b" } @in; }
      You and previous poster are almost there. But instead of setting $" to a byte that's not expected to be present, set it to the empty string:
      { local $" = ""; @out = sort {"@$a" cmp "@$b"} @in; }

      You can combine that with a GRT or an ST.

      Abigail

Re: Sort problem
by Thelonius (Priest) on Feb 26, 2003 at 19:27 UTC
    Here's a more general solution:
    sub arrarycomp { my ($x, $y) = @_; my $arrmin = $#$x; $arrmin = $#$y if $#$y < $arrmin; my $result; for (0 .. $arrmin) { if ($result = $$x[$_] cmp $$y[$_]) { return $result; } } return $#$x - $#$y; } @out = sort {arrarycomp($a, $b)} @in;
    I am assuming that for this array:
    my @in = ( [ "The Cat", "G", "F", "H", "J" ], [ "The", "Dog", "G", "F", "H", "J",], );
    You want this output Dumper(@out):
    $VAR1 = [ 'The', 'Dog', 'G', 'F', 'H', 'J' ]; $VAR2 = [ 'The Cat', 'G', 'F', 'H', 'J' ];
Re: Sort problem
by jsprat (Curate) on Feb 26, 2003 at 19:53 UTC
    Here's my stab at it. Note that this will sort properly for longer strings (ie ['MN', 'QPO'] and qw[M N Q P O]) as well as strings with embedded spaces.

    use strict; use Data::Dumper; my @in = ( [qw[M N]], [qw[M N Q P O]], ['MN', 'QPO'], [qw[C B A]], ['C ', 'A', 'A], [qw[U S]], [qw[V T]], [qw[C G F E D]], [qw[C G F H]], [qw[C G F H I]], [qw[C G F H J]], [qw[M K]], ); sub arrayrefs { my $lindex = $#$a < $#$b ? $#$a : $#$b; for (0 .. $lindex) { return $$a[$_] cmp $$b[$_] unless $$a[$_] eq $$b[$_]; } #equal or less return $#$a <=> $#$b; } my @out = sort arrayrefs @in; print Dumper @out;

    Update: Added a an element with a space and added the element mentioned earlier ['MN', 'QPO']