Re: noncase-sensitive sorting

The sort follows only sorts the first part of the string terminated at \0.

Eh, no, it doesn't (unless there's a bug ;-)). The sort compares the entire strings (well, up to the point they are different). The trick is that the original strings are prefixed (with a fixed length string) in such away that the prefix itself is enough to determine how to sort, unless the strings are equal (*). Strings shorter than the maximum length string are padded with NUL characters because NULs sort before any other character, and normally 'a' is sorted before 'ab'.

In fact, a GRT and a ST are doing to same trick, but where ST uses a 2 element array, one element the original value, the other the computed value, GRT combines both values into a single string.

(*) Well, not quite. If a string would end in a NUL (which doesn't happen in the benchmark), it would give the same prefix as a similar string without the trailing NUL. Say for instance the max length is 4, and we have two strings, "ab" and "ab\0". The both will have a "ab\0\0" prefix, but the strings used in the sort will differ: "ab\0\0ab" vs. "ab\0\0ab\0". In this case, the cmp will reach the end of the entire string, even while the strings differ. It'll go beyond the prefix.

Abigail

Comment on Re: noncase-sensitive sorting