
Just concatenate and remove the spaces and throw in a Schwartzian Transform for efficiency, like this:

cheers

tachyon

s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

Re: Re: Re: Sort problem
by dws (Chancellor) on Feb 26, 2003 at 19:55 UTC
    Just concatenate and remove the spaces and ...

    You and I have apparently come to quite different understandings of the problem BrowserUK posed. He left a few details out, but I assume that he's illustrating the shape of the data, and is providing a valuable clue to the actual content of the data when he writes "once the elements can be strings that could contain embedded spaces".

    If my assumption is true, then approaches that use concatenation will fail for some inputs, notably some inputs that contain embedded spaces.

    Perhaps this is a good opportunity for BrowserUK to clarify his intent.

      The example I posted uses concatenation but removes the spaces \040 and thus the problem.

      @in = (
          ['the cat', 'sat', 'on', 'the', 'mat'],
          ['the cat sat', 'wherever if felt like'],
          ['the cat'],
      );

      my @out = map  { $_->[0] }
                sort { $a->[1] cmp $b->[1] }
                map  { [$_, concat($_)] } @in;

      sub concat {
          my $ary = shift;
          $ary = join '', @$ary;
          $ary =~ s/\s//g;
          return $ary;
      }

      use Data::Dumper;
      $Data::Dumper::Indent = 0;
      print map { s/\[/\n \[/g; $_ } Dumper \@out;

      __DATA__
      $VAR1 = [
       ['the cat'],
       ['the cat','sat','on','the','mat'],
       ['the cat sat','wherever if felt like']];

      Please explain to me why this is not appropriately sorted and how it is failing. It seemed clear enough to me that he did not want 'the cat' to sort before ('the', 'cat') simply because char 4 of 'the cat' is \040. If I misread the problem, and he wants to sort including embedded spaces but without getting a space when you concat with "@ary", then all that is required is a local $" = ''; before the sort.
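
For readers unfamiliar with the list separator variable mentioned above, here is a minimal sketch of its effect on array interpolation. The variable names and sample data are illustrative only and are not taken from the thread.

```perl
use strict;
use warnings;

# Illustrative data; not from the original post.
my @ary = ('the cat', 'sat');

# By default, interpolating an array joins its elements with a single space.
my $with_space = "@ary";

my $without;
{
    # Localize the list separator so the change ends with this block.
    local $" = '';
    $without = "@ary";
}

print "$with_space\n";   # the cat sat
print "$without\n";      # the catsat
```

Note that this only drops the separator between elements; spaces embedded inside an element are kept.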

      cheers

      tachyon

      s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

        The example I posted uses concatenation but removes the spaces \040 and thus the problem. ... Please explain to me why this is not appropriately sorted ...

        To borrow from the standard example of why naive hyphenation can cause problems, if you remove spaces,   [qw(therapist)] sorts the same as   [qw(the rapist)]

        Try it with this sample data, for example:

        @in = (
            ['the cat sat'],
            ['the ', 'cat', 'sat'],
            ['the cat', 'sat'],
            ['the', 'cat', 'sat'],
        );

        I would expect it to be sorted 4, 2, 3, 1. Concatenating and compressing the spaces, then using the ST returns the array in its original order.
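
The collision can be checked directly, and one possible way to get the 4, 2, 3, 1 order while still using a Schwartzian Transform is to join on a separator that sorts below every other character. This is only a sketch: the helper names `stripped_key` and `boundary_key` are made up for illustration, and the "\0" separator is an assumption that only holds if no element can contain a NUL character.

```perl
use strict;
use warnings;

my @in = (
    ['the cat sat'],
    ['the ', 'cat', 'sat'],
    ['the cat', 'sat'],
    ['the', 'cat', 'sat'],
);

# Key built by concatenating and stripping whitespace:
# every row above collapses to the same string.
sub stripped_key {
    my $key = join '', @{ $_[0] };
    $key =~ s/\s//g;
    return $key;
}

# Hypothetical alternative key: join on "\0" so that element
# boundaries survive. Assumes no element contains "\0".
sub boundary_key { return join "\0", @{ $_[0] } }

my @stripped = map { stripped_key($_) } @in;

# Schwartzian Transform using the boundary-preserving key.
my @sorted = map  { $_->[0] }
             sort { $a->[1] cmp $b->[1] }
             map  { [ $_, boundary_key($_) ] } @in;

print "stripped keys: @stripped\n";
print join('|', @$_), "\n" for @sorted;
```

With the stripped keys all equal, the sort has nothing to distinguish the rows by. With the boundary-preserving key, the rows come out as rows 4, 2, 3, 1 of the input.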