Thanks for your code. Sorry I didn't explain the problem clearly enough.
The input could be encoded in iso-8859-1 (\x{f6}\x{f6}) or maybe in utf-8 (\x{c3}\x{b6}), so I have to find out the charset first.
I am using Encode::Detect::Detector to find out whether the string's charset is utf-8 or iso-8859-1, and then Text::Unaccent's unac_string($charset, $str) on the string. Text::Unaccent works well when Detector finds the correct charset; when Detector fails there is, of course, no charset to pass in, so it fails too. The logic is like:
    $charset = Encode::Detect::Detector::detect($input);
    if ($charset eq 'UTF-8') {
        # do NFC ...
    } elsif ($charset eq 'iso-8859-1') {
        # do NFD ...
    }
Encode::Detect::Detector normally works well, but it fails when the input is \x{f6}\x{f6}.
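
One possible fallback for the case where the detector returns nothing, sketched with the core Encode module only: try a strict UTF-8 decode, and treat the input as iso-8859-1 if that croaks. The names guess_charset and decode_input are hypothetical helpers made up for this sketch, not part of Encode::Detect::Detector or Text::Unaccent. It assumes the input can only be one of those two charsets; note that every byte string is valid iso-8859-1, so a Latin-1 string that happens to be well-formed UTF-8 would still be reported as UTF-8.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Hypothetical helper: returns 'UTF-8' if the bytes are well-formed
# UTF-8 (plain ASCII also counts), otherwise 'iso-8859-1'.
sub guess_charset {
    my ($bytes) = @_;
    # FB_CROAK makes decode() die on malformed input;
    # LEAVE_SRC keeps decode() from modifying $bytes.
    my $ok = eval { decode('UTF-8', $bytes, FB_CROAK | LEAVE_SRC); 1 };
    return $ok ? 'UTF-8' : 'iso-8859-1';
}

# Hypothetical helper: decode the bytes into a Perl character string
# using whichever charset guess_charset() picked.
sub decode_input {
    my ($bytes) = @_;
    return decode(guess_charset($bytes), $bytes);
}
```

With this, "\xf6\xf6" is classified as iso-8859-1 because \xf6 is not a legal byte in UTF-8, while "\xc3\xb6" is classified as UTF-8. The result of guess_charset could be used as $charset whenever Encode::Detect::Detector::detect($input) gives no answer.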
In reply to Re^4: Perl detect utf8, iso-8859-1 encoding
by swiftlet
in thread Perl detect utf8, iso-8859-1 encoding
by swiftlet