in reply to Issue with unac_string()

Using Text::Unaccent::PurePerl, Encode, Encode::Detect:
#!/usr/bin/perl -l
use strict;
use warnings;
use Text::Unaccent::PurePerl qw(unac_string);
use Encode;
require Encode::Detect;

my $str = "Los-Cabos-Meliá";
my $utf8 = decode('Detect', $str);

binmode STDOUT, ":encoding(UTF-8)";
print "Original : $utf8";

my $unaccented = unac_string($utf8);
print "Unaccent : $unaccented";

Re^2: Issue with unac_string()
by Anonymous Monk on May 17, 2012 at 12:48 UTC
    With "use utf8", I get the following warning: "Malformed UTF-8 character (1 byte, need 3, after start byte 0xe1)".
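One possible reading of that warning, assuming the script file was saved as Latin-1 (ISO-8859-1) rather than UTF-8: in Latin-1 the character á is the single byte 0xE1, while in UTF-8 it is the two bytes 0xC3 0xA1. A lone 0xE1 looks like the start of a three-byte UTF-8 sequence, which would explain "1 byte, need 3". The sketch below only illustrates that difference with the core Encode module; the byte strings and variable names are made up for the example and are not taken from the poster's file.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# The same text as raw bytes in two encodings (illustrative values).
my $latin1_bytes = "Los-Cabos-Meli\xE1";       # á as one Latin-1 byte
my $utf8_bytes   = "Los-Cabos-Meli\xC3\xA1";   # á as two UTF-8 bytes

# Strictly decoding the Latin-1 bytes as UTF-8 should fail, because
# 0xE1 announces a three-byte sequence that never arrives.
my $copy    = $latin1_bytes;   # decode() with a CHECK value may modify its input
my $decoded = eval { decode('UTF-8', $copy, Encode::FB_CROAK) };
my $strict_failed = !defined $decoded;

# Decoding each byte string with its real encoding yields the same text.
my $from_latin1 = decode('ISO-8859-1', $latin1_bytes);
my $from_utf8   = decode('UTF-8', $utf8_bytes);

binmode STDOUT, ':encoding(UTF-8)';
print "Strict UTF-8 decode of Latin-1 bytes failed: ",
      ($strict_failed ? "yes" : "no"), "\n";
print "Same text after decoding: ",
      ($from_latin1 eq $from_utf8 ? "yes" : "no"), "\n";
```

If that assumption holds, either re-saving the script as UTF-8 or decoding the string as ISO-8859-1 before unaccenting it would be the things to try.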
Re^2: Issue with unac_string()
by Anonymous Monk on May 17, 2012 at 12:57 UTC
    I tried this code, but it didn't work.
    #!/usr/bin/perl
    use strict;
    use warnings;
    use Text::Unaccent::PurePerl qw (unac_string);
    use URI::Escape;

    my $str = "Los-Cabos-Meliá";
    #my $str = "This is a simple string";
    #my $str = "Zo%C3%ABtry-Casa-del- (Mar) -Los-Cabos";
    #my $str = "http%3A%2F%2Fwww.travelnow.com%2Fvtours%2F281578.xml";
    #my $unescaped = uri_unescape($str);
    #my $unaccented = unac_string('UTF-8',$unescaped);
    my $unaccented = unac_string('UTF-8',$str);

    print "Original : ".$str."\n";
    #print "Unescape : ".$unescaped."\n";
    print "Unaccent : ".$unaccented."\n";
    The output is:
    Original : Los-Cabos-Meli
    Unaccent : Los-Cabos-Meli�
      Ah! You're functioning better than I am :). This seems to work for me. Let me know if it works for you.
      #!/usr/bin/perl -l
      use strict;
      use warnings;
      use Encode;
      require Encode::Detect;
      use Text::Unaccent::PurePerl qw(unac_string);
      use URI::Escape::XS qw/uri_escape uri_unescape/;

      my $str = "Los-Cabos-Meli\303\241";
      my $safe = uri_escape($str, "\303\241");
      $str = uri_unescape($safe);
      my $utf8 = decode('Detect', $str);

      binmode STDOUT, ":encoding(UTF-8)";
      print "Original: $utf8";

      my $unaccented = unac_string($utf8);
      print "Unaccented: $unaccented";
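For comparison, here is a minimal sketch that uses only core modules, in case Encode::Detect or URI::Escape::XS is not installed. It is not how unac_string() works internally; it swaps in a different technique: decode the bytes explicitly as UTF-8, decompose with Unicode::Normalize, and delete the combining marks. The helper name strip_accents is hypothetical, and the input is assumed to be UTF-8 bytes as in the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);
use Unicode::Normalize qw(NFD);

# Hypothetical helper: remove accents from a decoded (character) string
# by decomposing it and dropping nonspacing combining marks.
sub strip_accents {
    my ($chars) = @_;
    my $decomposed = NFD($chars);   # "á" becomes "a" + U+0301
    $decomposed =~ s/\p{Mn}//g;     # delete the combining marks
    return $decomposed;
}

# Assumed input: UTF-8 bytes written as octal escapes.
my $bytes = "Los-Cabos-Meli\303\241";
my $text  = decode('UTF-8', $bytes);   # bytes -> characters

binmode STDOUT, ':encoding(UTF-8)';
print "Original  : $text\n";
print "Unaccented: ", strip_accents($text), "\n";
```

This approach only handles characters that have a canonical decomposition. Letters such as "ø" or "æ" pass through unchanged, so a dedicated module like Text::Unaccent::PurePerl may still be the better fit for mixed input.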