in reply to Re: Re: Re: Sort problem
in thread Sort problem

Okay. I got caught by this a few weeks ago.

If you have ['A','BC'] being compared against ['AB','C'] at some point within the sort, then once concatenated, both become 'ABC' and compare as equal, rather than the former sorting lexically before the latter.
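
A minimal sketch of that collision, written in Python purely for illustration (the same thing happens with any language's join). The records and the NUL separator are hypothetical, and the separator only works on the assumption that no field contains it:

```python
# Hypothetical records: each is a list of fields to sort on.
a = ["A", "BC"]
b = ["AB", "C"]

# Naive key: plain concatenation loses the field boundary.
naive_a = "".join(a)  # "ABC"
naive_b = "".join(b)  # "ABC"
collide = naive_a == naive_b

# Comparing field by field keeps the boundary, so a sorts before b.
fieldwise_less = a < b

# A separator that sorts below every data character (NUL here, assuming
# the fields never contain it) gives the same order as the field-wise test.
SEP = "\x00"
sep_a = SEP.join(a)  # "A\x00BC"
sep_b = SEP.join(b)  # "AB\x00C"
sep_less = sep_a < sep_b
```

With the separator in place the two keys differ at the second character, where the separator compares lower than 'B'.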

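Equally, I have used various separators in the past: control characters (ord 0-31), DEL (ord 127) etc. With the advent of utf8 I assumed that individual bytes of a multi-byte char could legitimately hold these values, so that using them as a separator was no longer viable. (Some would say it never was :). Strictly speaking, every byte of a multi-byte utf-8 sequence is 0x80 or higher, so the real danger is the data itself containing the separator, or arriving in a different encoding such as UTF-16.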
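
The worry about multi-byte characters can be tested by brute force. The sketch below is in Python for illustration only and assumes strict UTF-8 as defined by Unicode; the function name is hypothetical. It collects every code point at or above 0x80 whose encoding contains a byte below 0x80, which is the range holding the control characters and DEL:

```python
def low_bytes_in_multibyte():
    """Return code points >= 0x80 whose UTF-8 encoding contains a byte < 0x80."""
    hits = []
    for cp in range(0x80, 0x110000):
        if 0xD800 <= cp <= 0xDFFF:  # surrogates cannot be encoded as UTF-8
            continue
        encoded = chr(cp).encode("utf-8")
        if min(encoded) < 0x80:
            hits.append(cp)
    return hits

offenders = low_bytes_in_multibyte()
```

If `offenders` comes back empty, the encoding itself never produces those byte values inside a multi-byte character; they can still turn up if the data contains the control character in its own right.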

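The only alternative I have found is using the byte pair 0xBF 0xBE as a separator. I believed this sequence could never legitimately appear in utf-8, but I am not yet confident I have understood the unicode stuff enough to be certain, and it is worth checking: both are continuation bytes, so they can sit side by side inside a longer sequence (U+1FFE encodes as 0xE1 0xBF 0xBE). The bytes 0xFE and 0xFF, by contrast, never occur anywhere in well-formed utf-8.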
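
A belief like that is cheap to check before relying on it. The sketch below is in Python for illustration and assumes strict UTF-8 as defined by Unicode, not Perl's laxer internal utf8; all names are hypothetical. It looks for the pair 0xBF 0xBE inside one encoded character and then lists the byte values that no encoded character ever uses:

```python
PAIR = b"\xbf\xbe"

# U+1FFE (GREEK DASIA) encodes to the three bytes E1 BF BE.
greek_dasia = "\u1ffe".encode("utf-8")
pair_found = PAIR in greek_dasia

def never_used_bytes():
    """Return the sorted byte values that appear in no UTF-8 encoded code point."""
    used = set()
    for cp in range(0x110000):
        if 0xD800 <= cp <= 0xDFFF:  # surrogates cannot be encoded as UTF-8
            continue
        used.update(chr(cp).encode("utf-8"))
    return sorted(set(range(256)) - used)

unused = never_used_bytes()
```

Any value in `unused` is a candidate separator byte as far as the encoding is concerned; whether it also sorts where you need it to is a separate question.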

