in reply to Re: Re: Sort problem
in thread Sort problem

Out of interest, why doesn't this work? Internal spaces are retained, but the concatenation does not add any spaces:

{ local $" = ''; @out = sort { "@$a" cmp "@$b" } @in; }

cheers

tachyon

s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

Re: Re: Re: Re: Sort problem
by BrowserUk (Patriarch) on Feb 26, 2003 at 21:09 UTC

    Okay. I got caught by this a few weeks ago.

    If you have ['A','BC'] being compared against ['AB','C'] at some point within the sort, then once concatenated, they compare as equal rather than the former being earlier lexically than the latter.
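
    A minimal sketch of that collision, using the two rows above as sample data. The by_elements helper is hypothetical (not something from this thread); it just shows one way to compare field by field so the boundaries are not lost:

    ```perl
    use strict;
    use warnings;

    # Two rows that differ field by field but concatenate to the same string.
    my @in = ( [ 'AB', 'C' ], [ 'A', 'BC' ] );

    # With an empty list separator both rows interpolate to "ABC".
    my @keys;
    {
        local $" = '';
        @keys = map { "@$_" } @in;
    }
    print "keys: @keys\n";

    # Hypothetical helper: compare two array refs one field at a time.
    sub by_elements {
        my ( $x, $y ) = @_;
        my $last = $#$x < $#$y ? $#$x : $#$y;
        for my $i ( 0 .. $last ) {
            my $c = $x->[$i] cmp $y->[$i];
            return $c if $c;
        }
        return @$x <=> @$y;    # if one row is a prefix of the other, shorter first
    }

    my @out = sort { by_elements( $a, $b ) } @in;
    print join( ' | ', map { join( ',', @$_ ) } @out ), "\n";    # A,BC | AB,C
    ```

    This costs a sub call per comparison, so it will be slower than a plain string cmp on a precomputed key, but it needs no separator at all.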

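    Equally, I have used various separators in the past: control characters (ord 0-31), DEL (ord 127) etc. These are only safe if the data can never contain them. In UTF-8 itself, bytes below 0x80 never occur inside a multi-byte character, but in other multi-byte encodings (UTF-16, say) the individual bytes of a character can legitimately hold these values, so using them as a byte-level separator is no longer viable there. (Some would say it never was :).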

    The only alternative I have found is using the byte pair 0xBF 0xBE as a separator. This sequence can never legitimately appear in UTF-8 (I believe), but I am not yet confident I have understood the Unicode stuff well enough to be certain.
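
    That belief can be tested with the core Encode module. A rough sketch, where U+3FFE is picked purely as a test case and the loop is a spot check rather than a proof: 0xBF and 0xBE are both ordinary UTF-8 continuation bytes, so the pair can turn up inside a valid character, whereas the bytes 0xFE and 0xFF are never used by UTF-8 at all.

    ```perl
    use strict;
    use warnings;
    use Encode qw(encode);

    # U+3FFE (a CJK Extension A ideograph) is chosen only as a test case.
    my $bytes = encode( 'UTF-8', chr(0x3FFE) );
    my $hex   = join ' ', map { sprintf( '%02X', ord($_) ) } split //, $bytes;
    print "U+3FFE encodes as $hex\n";    # E3 BF BE

    # Does the proposed separator occur inside this valid sequence?
    my $has_bf_be = index( $bytes, "\xBF\xBE" ) >= 0 ? 1 : 0;

    # Spot check over U+0080 .. U+D7FF: look for bytes below 0x80 inside
    # multi-byte sequences, and for the bytes 0xFE / 0xFF anywhere.
    my ( $low_byte_seen, $fe_ff_seen ) = ( 0, 0 );
    for my $cp ( 0x80 .. 0xD7FF ) {
        my $enc = encode( 'UTF-8', chr($cp) );
        $low_byte_seen = 1 if $enc =~ /[\x00-\x7F]/;
        $fe_ff_seen    = 1 if $enc =~ /[\xFE\xFF]/;
    }
    print "BF BE found: $has_bf_be, low byte found: $low_byte_seen, ",
        "FE/FF found: $fe_ff_seen\n";
    ```

    If that holds up, a single 0xFF byte (or a control character the data is known not to contain) looks like a safer separator for UTF-8 encoded bytes than 0xBF 0xBE.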


    ..and remember there are a lot of things monks are supposed to be but lazy is not one of them

    Examine what is said, not who speaks.
    1) When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.
    2) The only way of discovering the limits of the possible is to venture a little way past them into the impossible
    3) Any sufficiently advanced technology is indistinguishable from magic.
    Arthur C. Clarke.
Re: Re: Re: Re: Sort problem
by runrig (Abbot) on Feb 26, 2003 at 21:10 UTC
    Because now you are making ['thecat'] the same as ['the', 'cat'] (where the second should sort before the first). If we know we're dealing with plain text, then joining with a null character would be OK. If it's Unicode, I'm not sure...
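
    A sketch of the null-character join as a map sort map, assuming plain text that never contains "\0" itself (the sample rows are made up for illustration):

    ```perl
    use strict;
    use warnings;

    # Sample rows, made up for illustration.
    my @in = ( ['thecat'], [ 'the', 'cat' ], [ 'the', 'bat' ] );

    # Decorate each row with a "\0"-joined key, sort on the key, undecorate.
    # "\0" sorts below every other character, so a field boundary always
    # compares lower than any character that continues the field.
    my @out =
        map  { $_->[1] }
        sort { $a->[0] cmp $b->[0] }
        map  { [ join( "\0", @$_ ), $_ ] } @in;

    print join( ' | ', map { join( ',', @$_ ) } @out ), "\n";
    # the,bat | the,cat | thecat
    ```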

    Another map sort map solution would be to use sprintf to concatenate the elements using fixed lengths, but that would require first knowing what the maximum length of any field could be, and making sure your "%Ns" format is at least that large.
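
    A sketch of the fixed-width idea, with made-up sample rows. One assumption worth flagging: it pads with "%-*s" (left-justified) rather than a plain "%Ns", because right-justified padding would put the spaces in front of short fields and change their lexical order. The width is found with an extra pass over the data.

    ```perl
    use strict;
    use warnings;

    # Sample rows, made up for illustration.
    my @in = ( [ 'AB', 'C' ], [ 'A', 'BC' ], [ 'A', 'B' ] );

    # First pass: find the longest field so the format is wide enough.
    my $width = 0;
    for my $row (@in) {
        for my $field (@$row) {
            $width = length($field) if length($field) > $width;
        }
    }

    # Pad every field to $width (left-justified), concatenate, then sort.
    my @out =
        map  { $_->[1] }
        sort { $a->[0] cmp $b->[0] }
        map  { [ join( '', map { sprintf( '%-*s', $width, $_ ) } @$_ ), $_ ] } @in;

    print join( ' | ', map { join( ',', @$_ ) } @out ), "\n";
    # A,B | A,BC | AB,C
    ```

    Space padding means a field containing characters that sort below a space (control characters, for instance) could still compare oddly, so this is only a sketch for ordinary text.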