PerlMonks  

Re: Diacritic-Insensitive and Case-Insensitive Sorting

by Willard B. Trophy (Hermit)
on Jan 05, 2004 at 15:02 UTC [id://318859]


in reply to Diacritic-Insensitive and Case-Insensitive Sorting

Putting use locale; in your program gets you most of the way there. It doesn't treat all accented versions of a character as identical, but it does sort every version of A before B, which might be good enough for you. It certainly got me 95% of the way when my job was sorting multilingual dictionaries.

Using a code example from the perllocale pod, this is what I get as the collation order for my locale, en_CA:

0 1 2 3 4 5 6 7 8 9 _ A a À à Á á Â â Ã ã Ä ä Å å Æ æ B b C c Ç ç D d Ð ð E e È è É é Ê ê Ë ë F f G g H h I i Ì ì Í í Î î Ï ï J j K k L l M m N n Ñ ñ O o Ò ò Ó ó Ô ô Õ õ Ö ö Ø ø P p Q q R r S s ß T t U u Ù ù Ú ú Û û Ü ü V v W w X x Y y Ý ý ÿ Z z Þ þ
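A listing like the one above can be produced with something along these lines. This is a sketch modelled on the one-liner in perllocale, not necessarily the exact code used here. The sub name collation_order is made up for illustration, it assumes a single-byte locale (such as a Latin-1 en_CA), and the output depends entirely on the locale your environment selects.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use locale;                         # sort, cmp and \w now follow the locale
use POSIX qw(setlocale LC_ALL);

# Adopt whatever locale the environment asks for (LANG, LC_ALL, ...).
# Which locale that is on your machine is an assumption of this sketch.
setlocale(LC_ALL, "");

# Hypothetical helper: every single-byte character the locale treats as
# a word character, returned in the locale's collation order.
sub collation_order {
    my @chars  = grep { /\w/ } map { chr } 0 .. 255;
    my @sorted = sort @chars;
    return @sorted;
}

print join(" ", collation_order()), "\n";
```

Run it under a different LC_ALL setting to compare orders. In the plain C locale you should see only the unaccented characters, in byte order.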

Using locale is a bit slower than an unadorned sort, but it's far faster and has fewer pitfalls than rolling your own locale-emulation system. lc and uc do exactly what you'd expect under locale, too.
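To make the sort fold case explicitly, one option is to compare lc'd copies under locale. This is a minimal sketch, not a benchmarked recipe. The sub name locale_sort_nocase and the sample words are invented for illustration, and accented data would need to be in the byte encoding your locale expects.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use locale;                         # lc and cmp now follow the locale
use POSIX qw(setlocale LC_ALL);

# Use the locale from the environment; which one you get is system-specific.
setlocale(LC_ALL, "");

# Hypothetical helper: compare lower-cased copies first, then fall back
# to the original strings so that ties come out in a consistent order.
sub locale_sort_nocase {
    my @words  = @_;
    my @sorted = sort { lc($a) cmp lc($b) or $a cmp $b } @words;
    return @sorted;
}

my @words = qw(banana Apple cherry apple Banana);
print join(" ", locale_sort_nocase(@words)), "\n";
```

Whether Apple or apple comes first within a tie depends on the locale's own rules, so don't rely on that part of the order across systems.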

--
bowling trophy thieves, die!
