in reply to Diacritic-Insensitive and Case-Insensitive Sorting
If you put use locale; in your program, you'll get most of the way there. It doesn't treat all accented versions of a character as identical, but it does sort every version of A before B. This might be good enough for you. It certainly got me 95% of the way when my job was sorting multilingual dictionaries.
Using a code example from the perllocale pod, this is what I get as the collation order for my locale, en_CA:
0 1 2 3 4 5 6 7 8 9 _ A a À à Á á Â â Ã ã Ä ä Å å Æ æ B b C c Ç ç D d Ð ð E e È è É é Ê ê Ë ë F f G g H h I i Ì ì Í í Î î Ï ï J j K k L l M m N n Ñ ñ O o Ò ò Ó ó Ô ô Õ õ Ö ö Ø ø P p Q q R r S s ß T t U u Ù ù Ú ú Û û Ü ü V v W w X x Y y Ý ý ÿ Z z Þ þ
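For anyone who wants to try this on their own machine, here is roughly what that looks like. It's a sketch built around the one-liner from perllocale; the collation_order sub name and the explicit setlocale call are my own additions, and what you see depends entirely on which locales your system has installed and on your terminal's encoding.

```perl
use strict;
use warnings;
use locale;
use POSIX qw(locale_h);

# Take the locale from the environment (LANG, LC_ALL, ...).
# Perl normally does this at start-up anyway; spelled out here for clarity.
setlocale(LC_ALL, "");

# Hypothetical helper: every character in 0..255 that the current locale
# treats as a "word" character, sorted by the locale's collation rules.
sub collation_order {
    return sort grep { /\w/ } map { chr } 0 .. 255;
}

print join(" ", collation_order()), "\n";
```

Under the plain C locale you'll only get the ASCII digits, letters and underscore, in byte order. The accented characters show up once a locale such as en_CA is in effect.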
Using locale is a bit slower than an unadorned sort, but it's far faster and has fewer pitfalls than rolling your own locale-emulation system. lc and uc do exactly what you'd expect under locale, too.
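In practice the sort itself can stay very plain. A minimal sketch, assuming your word list is already in an encoding that matches your locale (the sort_words name is made up for this example):

```perl
use strict;
use warnings;
use locale;
use POSIX qw(locale_h);

# Use the collation and character-type rules from the environment.
setlocale(LC_ALL, "");

# Hypothetical helper: case-insensitive sort under the current locale.
# lc() follows LC_CTYPE and cmp follows LC_COLLATE while "use locale" is
# in scope; the second comparison only breaks ties between words that
# differ in case alone.
sub sort_words {
    return sort { lc($a) cmp lc($b) or $a cmp $b } @_;
}

print join(" ", sort_words(qw(banana Apple cherry apple))), "\n";
```

The pragma is lexically scoped, so the comparison has to live inside the block or file that says use locale; a sort elsewhere in your program is unaffected.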
--
bowling trophy thieves, die!