in reply to Unicode encoding

my $str = '\u00E1\u015F';
$str =~ s/\\u([0-9a-fA-F]{4,6})/chr(hex($1))/eg;

open my $handle, '>:encoding(UTF-8)', $file
    or die "Can't open file `$file' for writing: $!";
print $handle $str;
close $handle
    or warn $!;

See also Character encodings and Unicode in Perl.

Re^2: Unicode encoding
by Juerd (Abbot) on Jun 21, 2009 at 23:49 UTC

    $str =~ s/\\u([0-9a-fA-F]{4,6})/chr(hex($1))/eg;

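    These look like Java Unicode escapes, which always have exactly 4 hex digits as far as I've seen. \u20AC80 would mean €80, but your {4,6} quantifier is greedy, so it would swallow all six digits and treat 20AC80 as a single code point :) Matching exactly four digits avoids that: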

    s/\\u([0-9A-Fa-f]{4})/chr hex $1/ge;

    or die "Can't open file `$file' for writing: $!";

    Backticks and apostrophes as balanced quotes are ugly, and not at all balanced, in just about any font.
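
To make the {4,6} versus {4} point above concrete, here is a minimal sketch that runs both substitutions on the same input and reports what each one produced. The helper names decode_greedy and decode_four are made up for this illustration, and it assumes the input really uses Java-style escapes with exactly four hex digits.

```perl
use strict;
use warnings;

# Hypothetical helper: the pattern from the original post (4 to 6 hex digits).
sub decode_greedy {
    my ($s) = @_;
    $s =~ s/\\u([0-9A-Fa-f]{4,6})/chr hex $1/ge;
    return $s;
}

# Hypothetical helper: the stricter pattern (exactly 4 hex digits).
sub decode_four {
    my ($s) = @_;
    $s =~ s/\\u([0-9A-Fa-f]{4})/chr hex $1/ge;
    return $s;
}

my $input  = '\u20AC80';              # single quotes: a literal backslash-u escape
my $greedy = decode_greedy($input);   # takes "20AC80" as one code point
my $four   = decode_four($input);     # takes "20AC", leaves "80" as plain text

# Print only lengths and code points so the output stays ASCII.
printf "greedy: %d char(s), first code point U+%X\n", length($greedy), ord($greedy);
printf "four:   %d char(s), first code point U+%X\n", length($four),   ord($four);
# prints:
# greedy: 1 char(s), first code point U+20AC80
# four:   3 char(s), first code point U+20AC
```

The greedy version yields one character at 0x20AC80, which is above U+10FFFF and so not a Unicode code point at all, while the four-digit version yields the euro sign followed by "80".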