Emanuel has asked for the wisdom of the Perl Monks concerning the following question:

I've been searching perlmonks for pretty long now, also tried different resources on the web, but haven't found anything so far. What I'm doing: i'm building an array with all possible combinations of a word list (code of combination sub found on perlmonks). i then get the array @combos back, which i need sorted by value length of the array. Getting it sorted by value is not the problem, but by length of the value of each entry.
my @array; push @array, "one"; push @array, "two"; push @array, "three"; psuh @array, "four"; my $k=0; my @combos; for(combinations(@array)) { $combos[$k] = join("|",@$_); $k++; } sub combinations { return [] unless @_; my $first = shift; my @rest = combinations(@_); return @rest, map { [$first, @$_] } @rest; }
this returns the following array:
$VAR1 = [ '', 'four', 'three', 'three|four', 'two', 'two|four', 'two|three', 'two|three|four', 'one', 'one|four', 'one|three', 'one|three|four', 'one|two', 'one|two|four', 'one|two|three', 'one|two|three|four' ];
but what I need is to sort the array, so the return value is something like:
$VAR1 = [ 'one|two|three|four' 'two|three|four', 'one|three|four', 'one|two|three', 'one|two|four', 'three|four', 'two|three', 'one|three', 'two|four', 'one|four', 'one|two', 'three', 'four', 'two', 'one', '' ];
any help regarding this would be greatly appreciated.


Emanuel

Replies are listed 'Best First'.
Re: Sort Array by length of Value
by DrHyde (Prior) on Oct 06, 2003 at 14:34 UTC
    There is an example in the documentation for the sort function of how to sort case insensitively. Adapting it to sort by length is trivial. You will need to use the length() function and the numeric comparison "spaceship" operator, <=>.
      too simple, i didn't see it...

      @combos = reverse sort {length(uc($a)) <=> length(uc($b))} @combos;

      sigh, thanks for pointing it out, i guess i was searching way too far.

      regards,

      Emanuel
        You don't need the reverse if you compare reversed values to begin with.
        @b = sort { length $b <=> length $a } @a;
        But then I also wonder if the order of the values _after_ sorting on length might be significant.   You might want to consider something like this:
        @b = sort { length $b <=> length $a || $a cmp $b } @a;
      E.g.,

      @arr = sort { length $b <=> length $a } @arr;

Re: Sort Array by length of Value
by jmcnamara (Monsignor) on Oct 06, 2003 at 14:41 UTC

    After you create @combos you can sort it as follows:
    @combos = sort {length $b <=> length $a} @combos;

    --
    John.

Re: Sort Array by length of Value
by flounder99 (Friar) on Oct 06, 2003 at 17:07 UTC
    How can there be a node about sorting without mentioning the Schwartzian Transform? I know it is overkill in this case.
    use strict; use Data::Dumper; my @array = ( 'four', 'three', 'three|four', 'two', 'two|four', 'two|three', 'two|three|four', 'one', 'one|four', 'one|three', 'one|three|four', 'one|two', 'one|two|four', 'one|two|three', 'one|two|three|four' ); @array = map {$_->[1]} sort {$b->[0] <=> $a->[0]} map {[length($_), $_]} @array; print Dumper(\@array);
    outputs:
    $VAR1 = [ 'one|two|three|four', 'two|three|four', 'one|three|four', 'one|two|three', 'one|two|four', 'three|four', 'two|three', 'one|three', 'two|four', 'one|four', 'one|two', 'three', 'four', 'two', 'one' ];

    --

    flounder

      Out of interest, I have constructed the following test to compare the performance of various solutions:

      use strict; use Benchmark qw(timethese cmpthese); my @array = ( 'four', 'three', 'three|four', 'two', 'two|four', 'two|three', 'two|three|four', 'one', 'one|four', 'one|three', 'one|three|four', 'one|two', 'one|two|four', 'one|two|three', 'one|two|three|four' ); timethese( 10000, {'Schwartzian' => '&Schwartzian()', 'Orcish' => '&Orcish()', 'Direct Sort' => '&DirectSort()', } ); cmpthese( 10000, {'Schwartzian' => '&Schwartzian()', 'Orcish' => '&Orcish()', 'Direct Sort' => '&DirectSort()', } ); sub Schwartzian() { my @sorted_array = map {$_->[1]} sort {$b->[0] <=> $a->[0]} map {[length($_), $_]} @array; } sub Orcish() { my %m; my @sorted_array = sort { ($m{$b} ||= length($b)) <=> ($m{$a} ||= length($a)) } @array; } sub DirectSort() { my @sorted_array = sort {length $b <=> length $a} @array; }
      And the result is as follows on a 1.8GHz Pentium 4:

      Benchmark: timing 100000 iterations of Direct Sort, Orcish, Schwartzia +n... Direct Sort: 4 wallclock secs ( 4.00 usr + 0.02 sys = 4.02 CPU) @ 2 +4900.40/s Orcish: 10 wallclock secs ( 9.47 usr + 0.00 sys = 9.47 CPU) @ 10 +560.78/s Schwartzian: 13 wallclock secs (13.77 usr + 0.00 sys = 13.77 CPU) @ 7 +264.80/s Rate Schwartzian Orcish Direct Sort Schwartzian 7272/s -- -31% -71% Orcish 10561/s 45% -- -58% Direct Sort 24907/s 242% 136% --
      It turns out that the Schwartzian sort is the slowest of all. (The Orcish Maneuver was put in out of curiosity to see how expensive is the length function.)

      From the result I have deduced a couple of interesting results:

      1) Orcish Maneuver is useless unless on expensive operations, where the operation takes significantly longer time than the hash table insert and lookup.

      2) Schwartzian Transformation on a simple array is an expensive operation.

      3) Simplest sort is the best for very fast functions in the comparison, such as length.