Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks,

I am trying to convert a properties file that contains Unicode escape sequences (\uXXXX) into UTF-8.

To make this clearer, here is an example of what I want to do:

The base file looks like this:

global.lifeTime=\u00C9lettartam
global.lifeTime_from=\u00C9lettartam \u00F3ta
global.lifeTime_to=\u00C9lettartam eddig
fastExport.tooltip=Javaslat a lapoz\u00E1sra\: kattintson a grafik\u00E1ra <br /> Tartsa lenyomva a bal eg\u00E9rgombot, mik\u00F6zben az egeret mozgatja.

What I want to get out looks like this:

global.lifeTime=Élettartam
global.lifeTime_from=Élettartam óta
global.lifeTime_to=Élettartam eddig
fastExport.tooltip=Javaslat a lapozásra: kattintson a grafikára <br /> Tartsa lenyomva a bal egérgombot, miközben az egeret mozgatja.

Can you tell me how to read the whole file (it has about 2000 lines) and transform all the lines as in the example above?

Best regards and thanks in advance

Tobias

Re: How to transform a string containing unicode into utf8?
by tobyink (Canon) on Oct 24, 2012 at 08:53 UTC

    Use a regular expression to grab \u followed by four hex digits, then use hex and chr to convert that to a character.

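    use utf8::all;    # CPAN module; among other things, makes STDOUT emit UTF-8

    my $str = <<'EOF';
    global.lifeTime=\u00C9lettartam
    global.lifeTime_from=\u00C9lettartam \u00F3ta
    global.lifeTime_to=\u00C9lettartam eddig
    fastExport.tooltip=Javaslat a lapoz\u00E1sra\: kattintson a grafik\u00E1ra <br /> Tartsa lenyomva a bal eg\u00E9rgombot, mik\u00F6zben az egeret mozgatja.
    EOF

    # Match \u plus four hex digits; /e evaluates the replacement as code,
    # /g replaces every match, /i ignores case, /x permits the extra spaces.
    $str =~ s{ \\u([0-9A-F]{4}) }{ chr hex $1 }egix;

    print $str;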
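To apply the same substitution to the whole file rather than a string in the script, something along these lines might do. It is only a sketch under a few assumptions: the helper names `decode_unicode_escapes` and `convert_file` are made up for illustration, the input is assumed to be ISO-8859-1 (the traditional encoding for Java properties files), and only core Perl is used, so `utf8::all` is not needed. Other escapes such as `\:` are left untouched, and surrogate pairs are not combined.

```perl
use strict;
use warnings;

# Hypothetical helper: replace each \uXXXX escape with the character it names.
sub decode_unicode_escapes {
    my ($text) = @_;
    $text =~ s{\\u([0-9A-Fa-f]{4})}{chr hex $1}ge;
    return $text;
}

# Hypothetical driver: read the input line by line and write UTF-8 output.
# Assumes the input file is ISO-8859-1; adjust the layer if yours differs.
sub convert_file {
    my ($in_path, $out_path) = @_;
    open my $in, '<:encoding(ISO-8859-1)', $in_path
        or die "Cannot read $in_path: $!";
    open my $out, '>:encoding(UTF-8)', $out_path
        or die "Cannot write $out_path: $!";
    while (my $line = <$in>) {
        print {$out} decode_unicode_escapes($line);
    }
    close $in;
    close $out or die "Cannot close $out_path: $!";
    return;
}

# Run the conversion only when two file names are given on the command line.
convert_file(@ARGV) if @ARGV == 2;
```

Saved as, say, `convert.pl` (a placeholder name), it would be run as `perl convert.pl input.properties output.properties`. Reading line by line keeps memory use flat, although a file of about 2000 lines would also fit comfortably in memory.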
      Great! That works. Thanks a lot!