in reply to Re^3: Normalizing diacritics in (regex) search
in thread Normalizing diacritics in (regex) search

Thanks, very similar to jo37's solution (and not really using Unicode::Collate like you suggested° ;-)

But jo37's approach with NFD is IMHO better because of the "dangers of pathological characters" I mentioned...

Consider U+3374 ㍴ (SQUARE BAR): NFKD applies the compatibility decomposition and turns it into "bar", NFD leaves it alone. That means a symbol like "㍴" could match inside "Barbra Streisand" in a case-insensitive search. So if eliminating diacritics is the only goal, NFD is preferable.
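
A minimal sketch of the difference, assuming the usual "decompose, then strip combining marks" recipe. The helper names strip_nfd and strip_nfkd are made up for illustration and are not taken from jo37's post:

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFD NFKD);

# Hypothetical helpers: decompose, then remove all combining marks.
sub strip_nfd  { my $s = NFD(shift);  $s =~ s/\p{M}+//g; return $s }
sub strip_nfkd { my $s = NFKD(shift); $s =~ s/\p{M}+//g; return $s }

my $name   = "Barbra Streisand";
my $symbol = "\x{3374}";                 # U+3374 SQUARE BAR

for my $variant ( [ NFKD => \&strip_nfkd ], [ NFD => \&strip_nfd ] ) {
    my ($label, $strip) = @$variant;
    my $key = $strip->($symbol);         # normalized search term
    my $hit = $name =~ /\Q$key\E/i ? "yes" : "no";
    print "$label hit: $hit\n";
}
# prints:
# NFKD hit: yes
# NFD hit: no
```

Both helpers turn "Zoë" into "Zoe"; they only disagree on compatibility characters like the one above.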

°) For completeness: There is an example in the Unicode::Collate documentation, demonstrating normalized search with a (broken°) German phrase.

Alas, I haven't "studied" this module enough to tell whether it exactly matches my requirement of ignoring only diacritics.
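
From memory, that example looks roughly like the sketch below; treat the details as an assumption and check the module's POD. The wrapper find_primary is a made-up name. Note that level => 1 compares primary weights only, so it ignores case as well as diacritics, which is more than I asked for:

```perl
use strict;
use warnings;
use utf8;
use Unicode::Collate;

# level => 1: compare base letters only (accents and case are ignored)
my $collator = Unicode::Collate->new( normalization => undef, level => 1 );

# Hypothetical wrapper: return the matched part of $haystack, or undef.
sub find_primary {
    my ($haystack, $needle) = @_;
    if ( my ($pos, $len) = $collator->index($haystack, $needle) ) {
        return substr($haystack, $pos, $len);
    }
    return;
}

binmode STDOUT, ':encoding(UTF-8)';
my $match = find_primary("Ich muß studieren Perl.", "MÜSS");
print defined $match ? "matched: $match\n" : "no match\n";
```

According to the documentation the match is "muß", since "muß" and "MÜSS" are equal at the primary level.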

Cheers Rolf
(addicted to the Perl Programming Language :)

°) Ha :) ... you can almost hear an English accent in this word order. OTOH, I suppose it's easier for English speakers to decipher than "Ich muss Perl studieren" (which is still slightly off; "lernen" would be better in this case).