What I mean is converting ASCII-encoded Unicode characters, as in my example, to normal Unicode. For example (the \uXXXX sequences are the ASCII-encoded Unicode characters):
Compruebe si las direcciones URL que encontr\u00e9 en el archivo de configuraci\u00f3n son v\u00e1lidos
I can't figure out how to convert those representations to normal characters.
I tried your suggestion, but it seems to just remove the escaped character along with the character that follows it :(
Here is the result that I get, in comparison to the original string:
Compruebe si las direcciones URL que encontr\u00e9 en el archivo de configuraci\u00f3n son v\u00e1lidos
Compruebe si las direcciones URL que encontren el archivo de configuraci son vidos
Here is what it should have been:
Compruebe si las direcciones URL que encontré en el archivo de configuración son válidos
Is your STDOUT in UTF-8 mode (as in binmode STDOUT, ':encoding(utf-8)', for example)? That's the way to get 'normal Unicode' output in Perl. Does this work for you:
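my $str = 'encontr\u00e9 configuraci\u00f3n v\u00e1lidos';
# Replace each \uXXXX escape with the character for that code point
$str =~ s/ \\u ( \p{Hex}{4} ) / chr hex $1 /gex;
# Encode characters as UTF-8 on output
binmode STDOUT, ':encoding(utf-8)';
print $str, "\n";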
?
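To see why binmode matters here, the sketch below looks at the bytes that would be written with and without an encoding layer. The helper name decode_u_escapes is made up for this illustration, and it assumes plain four-digit \uXXXX escapes only (it does not try to combine surrogate pairs). Note that 'use utf8;' only tells perl that the source file itself is UTF-8; it does not change how STDOUT is encoded.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: turn four-digit \uXXXX escapes into real characters.
sub decode_u_escapes {
    my ($text) = @_;
    $text =~ s/\\u([0-9A-Fa-f]{4})/chr hex $1/ge;
    return $text;
}

# Render a byte string as space-separated hex pairs.
sub hex_dump {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', ord($_) } split //, $bytes;
}

my $decoded = decode_u_escapes('encontr\u00e9');

# Without an encoding layer, print writes code points below 256
# as single bytes, which is the same as Latin-1 encoding.
my $latin1_bytes = encode('iso-8859-1', $decoded);

# With ':encoding(UTF-8)' on the handle, the e-acute becomes two bytes.
my $utf8_bytes = encode('UTF-8', $decoded);

printf "characters:    %d\n", length $decoded;
printf "Latin-1 bytes: %s\n", hex_dump($latin1_bytes);   # ends in E9
printf "UTF-8 bytes:   %s\n", hex_dump($utf8_bytes);     # ends in C3 A9
```

A lone E9 byte is not valid UTF-8 on its own; a UTF-8 terminal will typically read it as the start of a multi-byte sequence, which may be why the character after each escape disappeared in the earlier output.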
Thanks. I already had 'use utf8;', but it was binmode that fixed my problem. I am still struggling to understand Unicode concepts :(
With binmode, it works using either hex or eval. I also tried pack('U', hex $1), and that works too.
For some reason I thought pack 'U' takes hex digits as its argument, but I actually had to convert them to a number with hex first, because pack('U', $1) didn't work.
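A small sketch of that difference, assuming the captured text is '00e9' (the variable names are only for illustration). pack 'U' expects a numeric code point, so the hex digits have to go through hex first; passed as-is, the string is read as an ordinary decimal number and comes out as 0.

```perl
use strict;
use warnings;

my $hex_digits = '00e9';    # what the regex would capture in $1

# Both of these convert the digits to the number 0xE9 first.
my $via_chr  = chr hex $hex_digits;
my $via_pack = pack 'U', hex $hex_digits;

# Passing the digits straight to pack: Perl numifies the string
# as a decimal number, and '00e9' evaluates to 0.
my $wrong;
{
    no warnings 'numeric';    # silence any "isn't numeric" warning
    $wrong = pack 'U', $hex_digits;
}

printf "chr hex:           U+%04X\n", ord $via_chr;
printf "pack 'U', hex:     U+%04X\n", ord $via_pack;
printf "pack 'U', digits:  U+%04X\n", ord $wrong;
```

So chr hex $1 and pack('U', hex $1) give the same character here, and either can be used in the substitution.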