http://qs1969.pair.com?node_id=866102


in reply to The Björk Situation

Try this snippet, if your strings are already decoded into Unicode characters (not raw bytes):
use Unicode::Normalize ();

# Function: translate_diacriticals()
#
# Remove diacritical marks (e.g. ümlauts, Hebrew vowels, etc.)
# for use in fuzzy matches, or for avoiding excess information loss
# when encoding to restricted character sets like ASCII.
#
# See also:
#   http://www.perlmonks.org/?node_id=835238
#   http://en.wikipedia.org/wiki/Diacritic
#   http://en.wikipedia.org/wiki/Unicode_equivalence
#   http://unicode.org/reports/tr15/
#   http://www.faqs.org/rfcs/rfc3454.html
#
sub translate_diacriticals($) {
    # NFKD splits each precomposed character into base character + combining marks
    my $str = Unicode::Normalize::NFKD($_[0]);
    # Drop the combining (nonspacing) marks, keeping only the base characters
    $str =~ s/\p{NonspacingMark}//g;
    return $str;
}
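For what it's worth, here is a small standalone sketch of how the function behaves on a few inputs. The sub is repeated so the snippet runs on its own; the sample strings are my own picks, not from the original thread, and they are written with \x{...} escapes so the result does not depend on the source file's encoding. It assumes the input is a decoded character string.

```perl
use strict;
use warnings;
use Unicode::Normalize ();

# Same approach as above, repeated so this example is self-contained.
sub translate_diacriticals($) {
    my $str = Unicode::Normalize::NFKD($_[0]);
    $str =~ s/\p{NonspacingMark}//g;
    return $str;
}

binmode STDOUT, ':encoding(UTF-8)';

# o-with-diaeresis decomposes to "o" + COMBINING DIAERESIS, which is then stripped
print translate_diacriticals("Bj\x{F6}rk"), "\n";          # Bjork

# NFKD is a compatibility decomposition, so the "fi" ligature (U+FB01)
# is also expanded to plain "fi"
print translate_diacriticals("\x{FB01}anc\x{E9}e"), "\n";  # fiancee

# O-with-stroke (U+00D8) has no decomposition, so it passes through unchanged
print translate_diacriticals("\x{D8}resund"), "\n";
```

Note the last case: letters such as Ø, ø, ł or đ are not base letter + mark as far as Unicode is concerned, so this approach leaves them alone. If you need plain ASCII for those too, you would have to map them by hand or with a transliteration module.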