Re: Diacritic-Insensitive and Case-Insensitive Sorting

If you add use locale; to your script, you'll get most of the way there. It doesn't treat all accented versions of a character as identical, but it does sort every version of A before B. That might be good enough for you; it got me about 95% of the way when my job was sorting multilingual dictionaries.
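Here's a minimal sketch of what I mean. The word list is made up, and the locale name en_CA.ISO-8859-1 is an assumption about what your system has installed (check locale -a); if it isn't there, the code falls back to whatever locale your environment specifies, and under the plain C locale you'll just get byte order.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_COLLATE);

# Hypothetical word list. The \x{..} escapes are ISO-8859-1 code points:
# \x{E9} = e-acute, \x{C9} = E-acute, \x{E0} = a-grave.
my @words = (
    "zebra", "\x{E9}clair", "Eclipse", "echo",
    "\x{C9}mile", "apple", "\x{E0} la carte", "Banana",
);

# Assumption: a Latin-1 en_CA locale may be installed. setlocale returns
# undef when the named locale is unavailable, in which case we fall back
# to the locale taken from the environment.
setlocale(LC_COLLATE, 'en_CA.ISO-8859-1') or setlocale(LC_COLLATE, '');

my @sorted;
{
    use locale;    # within this block, cmp (and so a bare sort) follow LC_COLLATE
    @sorted = sort @words;
}

print join("\n", @sorted), "\n";
```

The exact order you see depends on the collation tables your system ships, so treat the result as something to inspect rather than rely on blindly.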

Using a code example from the perllocale pod, this is what I get as the collation order for my locale, en_CA:

0 1 2 3 4 5 6 7 8 9 _ A a À à Á á Â â Ã ã Ä ä Å å Æ æ B b C c Ç ç D d Ð ð E e È è É é Ê ê Ë ë F f G g H h I i Ì ì Í í Î î Ï ï J j K k L l M m N n Ñ ñ O o Ò ò Ó ó Ô ô Õ õ Ö ö Ø ø P p Q q R r S s ß T t U u Ù ù Ú ú Û û Ü ü V v W w X x Y y Ý ý ÿ Z z Þ þ
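For anyone who wants to check their own locale, the perllocale example is, as far as I remember it, essentially the following. I've added strict and warnings and join the characters with spaces, so treat the details as my reconstruction rather than a verbatim copy of the pod.

```perl
use strict;
use warnings;
use locale;

# Take every single-byte character, keep the ones the current locale
# considers word characters, and sort them by the locale's collation rules.
my @chars = sort grep { /\w/ } map { chr } 0 .. 255;

# Interpolating an array joins its elements with spaces.
print "@chars\n";
```

Under the C or POSIX locale you'll only see the digits, the underscore and the unaccented ASCII letters, so a short listing usually means your environment isn't selecting the locale you think it is.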

Using locale is a bit slower than an unadorned sort, but it's far faster and has fewer pitfalls than rolling your own locale-emulation system. lc and uc respect the locale too, so accented letters change case the way you'd expect.
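A quick way to see the lc and uc behaviour for yourself is sketched below. The locale name is again an assumption about your system, and the sample strings are invented; under the C locale the accented letters will simply be left alone.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_CTYPE);
use locale;

# Assumption: a Latin-1 en_CA locale may be installed; otherwise fall back
# to the locale taken from the environment.
setlocale(LC_CTYPE, 'en_CA.ISO-8859-1') or setlocale(LC_CTYPE, '');

# \x{C9} is E-acute and \x{E9} is e-acute in ISO-8859-1.
my $upper_in = "\x{C9}MILE";
my $lower_in = "\x{E9}mile";

my $lowered = lc $upper_in;    # case-maps the accented letter if LC_CTYPE knows it
my $raised  = uc $lower_in;

# Print the first character as hex so the result is readable on any terminal.
printf "lc: %02X + %s\n", ord($lowered), substr($lowered, 1);
printf "uc: %02X + %s\n", ord($raised),  substr($raised, 1);
```

If the first byte comes back unchanged (C9 for lc, E9 for uc), the active locale doesn't know about that character.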

--
bowling trophy thieves, die!
