yeehaw has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks, I'm searching for a mechanism to encode a string like "Hello ä" into "Hello &auml;/&#228;" so that I can write it to an HTML file. I need this because I only generate an HTML snippet, which is then included in a larger HTML page. That page does not use the UTF-8 charset, and I cannot change it. I tried to use HTML::Entities, but it only gave me strange codes like &Atilde;&para;&Atilde;&curren; for öä. My code is:
open(OUTTEMP, '>:utf8', "$outfile");
print OUTTEMP "some text" . encode_entities("öä") . "some text\n";
Thanks, Yeehaw

Replies are listed 'Best First'.
Re: How to encode special characters like ä,ß,é,... in html unicode codes like &auml; or &#228;
by Corion (Patriarch) on Oct 11, 2011 at 11:33 UTC

    From your output of &Atilde;&para;&Atilde;&curren;, I would assume that you have not told Perl that the data you are reading in is UTF-8 as well. You need to make sure that Perl knows the incoming data is UTF-8, either through the layer parameter of open or by telling your database driver to decode the data as UTF-8.

    If all else fails, you can manually decode your data from UTF-8 using Encode::decode('UTF-8', $data), which returns the decoded character string, but I would do that at the source instead of at the end of processing.

      Thanks, with Encode::decode('UTF-8', $data) it works great.
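To illustrate the difference that decoding makes, here is a minimal sketch. It assumes the input arrives as raw UTF-8 bytes (hard-coded here as `$bytes`, a stand-in for whatever the file or database actually delivers) and compares what `encode_entities` returns before and after `Encode::decode`. The variable names are illustrative only.

```perl
use strict;
use warnings;
use Encode qw(decode);
use HTML::Entities qw(encode_entities);

# Raw UTF-8 bytes for "öä", as they would arrive from an undecoded source
# (assumption: this stands in for the real input).
my $bytes = "\xC3\xB6\xC3\xA4";

# Without decoding, each of the four bytes is treated as one Latin-1 character.
my $wrong = encode_entities($bytes);

# Decoding first turns the four bytes into two characters.
my $chars = decode('UTF-8', $bytes);
my $right = encode_entities($chars);

print "undecoded: $wrong\n";   # undecoded: &Atilde;&para;&Atilde;&curren;
print "decoded:   $right\n";   # decoded:   &ouml;&auml;
```

The undecoded result is exactly the kind of "strange code" described in the question: the two bytes of each UTF-8 character are encoded as two separate entities.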
Re: How to encode special characters like ä,ß,é,... in html unicode codes like &auml; or &#228;
by moritz (Cardinal) on Oct 11, 2011 at 11:28 UTC
      But why do these strange codes come from HTML::Entities?
Re: How to encode special characters like ä,ß,é,... in html unicode codes like &auml; or &#228;
by trizen (Hermit) on Oct 11, 2011 at 14:30 UTC
    use encoding 'utf8';
    use HTML::Entities 'decode_entities';

    my $string = 'Hello ä/ä ... ß,é';
    $string =~ s/(.)/ord($1)>160?'&#'.ord($1).';':$1/ge;
    print "$string\n";                     # encoded
    print decode_entities($string),"\n";   # decoded
    The principle is this:
    use encoding 'utf8';
    print '&#' . ord('ä') . ';';   # prints &#228;
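The same principle can be wrapped in a small helper. This is only a sketch under a few assumptions: `to_numeric_entities` is a hypothetical name, it converts every non-ASCII character rather than using the `>160` threshold above, and it uses `use utf8` for the source literals instead of the `encoding` pragma, which is deprecated in newer Perl releases.

```perl
use strict;
use warnings;
use utf8;   # the source code itself contains UTF-8 literals

# Hypothetical helper: replace every non-ASCII character with a
# decimal numeric character reference such as &#228;
sub to_numeric_entities {
    my ($text) = @_;
    $text =~ s/([^\x00-\x7F])/'&#' . ord($1) . ';'/ge;
    return $text;
}

print to_numeric_entities('Hello ä, ß, é'), "\n";   # Hello &#228;, &#223;, &#233;
```

Because the result is pure ASCII, it can be written to the snippet file without worrying about the charset of the surrounding page. Note that the input must already be a decoded character string; raw UTF-8 bytes would produce two references per character.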
Re: How to encode special characters like ä,ß,é,... in html unicode codes like &auml; or &#228;
by Khen1950fx (Canon) on Oct 11, 2011 at 12:21 UTC
    I tried like this:
    #!/usr/bin/perl
    use strict;
    use warnings;
    use HTML::Entities;

    my $e = "\366\344";
    open OUTTEMP, '>', '/root/Desktop/html.log';
    binmode OUTTEMP, ':encoding(utf8)';
    print OUTTEMP "some text" . encode_entities($e) . "some text\n";
    close OUTTEMP;
    Or even easier:
    #!/usr/bin/perl -l
    use strict;
    use warnings;
    use HTML::Entities;

    my $e = "\366\344";
    print encode_entities($e);
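Since the question asks for either the named or the numeric form, here is a short sketch comparing the two using the same octal-escaped string as above. It relies on `encode_entities_numeric`, which HTML::Entities exports on request and which emits hexadecimal references rather than the decimal `&#228;` style; the variable names are illustrative.

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities encode_entities_numeric);

my $e = "\366\344";   # octal escapes for the characters ö and ä

my $named   = encode_entities($e);           # named entities
my $numeric = encode_entities_numeric($e);   # hexadecimal numeric entities

print "$named\n";     # &ouml;&auml;
print "$numeric\n";   # &#xF6;&#xE4;
```

Both forms are plain ASCII, so either should display correctly regardless of the charset declared by the enclosing page.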