in reply to Re: Re: Re: Sort problem
in thread Sort problem
Okay. I got caught by this a few weeks ago.
If ['A','BC'] gets compared against ['AB','C'] at some point within the sort, then once concatenated both become 'ABC' and compare as equal, rather than the former sorting lexically before the latter.
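To make the collision concrete, here is a minimal sketch in Python (chosen only for illustration; the variable names are made up and the thread does not prescribe any of this). It shows the two records comparing equal once joined, and ordering correctly when compared field by field.

```python
# Two records whose fields differ but whose concatenations collide.
first = ['A', 'BC']
second = ['AB', 'C']

joined_first = ''.join(first)    # 'ABC'
joined_second = ''.join(second)  # 'ABC'

# Concatenated keys: the sort cannot tell the two records apart.
keys_collide = joined_first == joined_second

# Field-by-field comparison: 'A' sorts before 'AB', so `first` comes earlier.
fields_ordered = first < second

records = [second, first]
# Lists compare element by element, so this puts `first` ahead of `second`.
sorted_by_fields = sorted(records)
# With equal keys, the stable sort just leaves the input order untouched.
sorted_by_joined = sorted(records, key=''.join)
```

Sorting on the joined key leaves the records in whatever order they arrived, which is exactly the symptom described above.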
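Equally, I have used various separators in the past: control characters (ord 0-31), DEL (ord 127) etc. But with Unicode data these can no longer be assumed absent. They are legitimate characters in their own right, and in multi-byte encodings such as UTF-16 an individual byte of a character can hold these values (UTF-8 itself keeps every byte of a multi-byte char in the range 0x80-0xFF). So using them as a separator is no longer viable. (Some would say it never was :).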
The only alternative I have found is using the two-byte combination 0xBF 0xBE as a separator. This sequence can never legitimately appear in UTF-8 (I believe), but I am not yet confident I have understood the Unicode stuff well enough to be certain.
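A belief like this can be tested by brute force. The following is a minimal Python sketch, not anything from the thread: the helper name `first_char_containing` is hypothetical, and it only looks inside single characters, not across the boundary between two adjacent ones. Under that check, 0xBF 0xBE does turn up inside valid UTF-8 (for example U+1FFE encodes as E1 BF BE), whereas the bytes 0xFE and 0xFF never appear at all, which may make them safer candidates.

```python
def first_char_containing(sep: bytes):
    """Return the first code point whose UTF-8 encoding contains `sep`, else None.

    Hypothetical helper for illustration; it checks each character on its
    own and does not consider sequences spanning two adjacent characters.
    """
    for code_point in range(0x110000):
        # Surrogates (U+D800..U+DFFF) cannot be encoded as UTF-8; skip them.
        if 0xD800 <= code_point <= 0xDFFF:
            continue
        if sep in chr(code_point).encode('utf-8'):
            return code_point
    return None


# Candidate from the post: two continuation bytes in a row.
hit = first_char_containing(b'\xbf\xbe')

# Bytes 0xFE and 0xFF are never produced by a UTF-8 encoder.
no_hit = first_char_containing(b'\xff')

if hit is not None:
    print(f"0xBF 0xBE occurs inside U+{hit:04X}")
print("0xFF found:", no_hit is not None)
```

Whatever separator is chosen, comparing the fields individually avoids the question altogether, at the cost of a slower comparison.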
..and remember there are a lot of things monks are supposed to be but lazy is not one of them