in reply to Encode string to HTML

You aren't decoding your source string first.

use strict;
use warnings;
use HTML::Entities;
use Encode;

my $TestStr = 'ï';
print encode_entities($TestStr) . "\n";
print encode_entities(decode('utf-8', $TestStr)) . "\n";

Have a read of perlunitut if you haven't already. It'll explain the basics.
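
To make the bytes-versus-characters point concrete, here is a minimal sketch. It assumes the input really is UTF-8 and writes the two bytes of U+00EF as escapes so that the encoding of the script file itself doesn't matter. The helper name entities_from_utf8_bytes is made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);
use HTML::Entities qw(encode_entities);

# Hypothetical helper: decode UTF-8 bytes into characters first,
# then turn the characters into HTML entities.
sub entities_from_utf8_bytes {
    my ($bytes) = @_;
    my $chars = decode('UTF-8', $bytes);
    my $encoded = encode_entities($chars);
    return $encoded;
}

my $bytes = "\xC3\xAF";    # the two UTF-8 bytes of U+00EF

print length($bytes), "\n";                     # 2 (bytes)
print length(decode('UTF-8', $bytes)), "\n";    # 1 (character)

# Without decoding, each byte is treated as a separate Latin-1 character.
print encode_entities($bytes), "\n";            # &Atilde;&macr;
print entities_from_utf8_bytes($bytes), "\n";   # &iuml;
```

The undecoded call isn't failing as such. It is faithfully encoding two characters, because nobody told Perl that those two bytes belong together.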

Re^2: Encode string to HTML
by Corion (Patriarch) on Nov 01, 2013 at 14:50 UTC

    Decoding from UTF-8 only helps if the source code is actually encoded as UTF-8. This may or may not be the case.

    At least according to Wikipedia, other likely encodings are ISO 8859-3, ISO 8859-9 or Windows-1254, if we guess that &iuml; is supposed to depict a Turkish letter.

      Indeed so - it is nigh on impossible to determine the encoding of a document from a single character, so the actual encoding of the source will only be known by gepebril69. UTF-8 seemed a reasonable first guess in this instance and it does produce the desired output for that one character.
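
If the encoding really is unknown, one common approach is to try a strict UTF-8 decode first and only fall back to a guess when that fails. The following is a rough sketch of that idea, not a detection method: decode_with_fallback is a made-up helper, and the ISO 8859-9 default is just one of the candidates mentioned above.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: strict UTF-8 first, then a guessed single-byte encoding.
sub decode_with_fallback {
    my ($bytes, $fallback) = @_;
    $fallback //= 'iso-8859-9';    # assumption: a guess, not a universal default

    # decode() may modify its argument when a CHECK value is given, so use a copy.
    my $copy  = $bytes;
    my $chars = eval { decode('UTF-8', $copy, Encode::FB_CROAK) };

    return defined $chars ? $chars : decode($fallback, $bytes);
}

# Valid UTF-8: decoded as UTF-8.
printf "U+%04X\n", ord decode_with_fallback("\xC3\xAF");

# A lone 0xEF followed by ASCII is not valid UTF-8: the fallback is used.
printf "U+%04X\n", ord substr(decode_with_fallback("na\xEFve"), 2, 1);
```

Both lines should report U+00EF, but only because the guess happens to be right for this byte. A wrong fallback decodes without complaint and gives you the wrong character.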

      I've checked the template file and it is

      text/html; charset=utf-8

      Now I understand why I had a similar problem in the past when parsing files. Perl doesn't seem to auto-detect the encoding. There is probably a logical reason for that, I guess.
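
For the file-parsing case, Perl can do the decoding while reading if it is told the encoding on the filehandle. Here is a small sketch, assuming the file is known to be UTF-8 as the template is. The name slurp_utf8 is made up, and the temporary file only exists to keep the example self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read a whole file, decoding UTF-8 on the way in.
sub slurp_utf8 {
    my ($path) = @_;
    open my $fh, '<:encoding(UTF-8)', $path
        or die "Cannot open $path: $!";
    local $/;                 # slurp mode: read everything at once
    my $text = <$fh>;
    close $fh;
    return $text;
}

# Write seven raw bytes ("naive" with a two-byte i-diaeresis, plus newline).
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;
print {$out} "na\xC3\xAFve\n";
close $out;

# Read back as characters: six of them, because the two bytes became one.
print length(slurp_utf8($path)), "\n";    # 6
```

With the layer in place, no separate decode call is needed for the file contents.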

Re^2: Encode string to HTML
by gepebril69 (Scribe) on Nov 01, 2013 at 15:22 UTC

    Thanks hippo

    That explains a lot. So in my case, when I want to define the unsafe characters, I have to use a similar method.

    my $UnsafeChar = 'ïé';
    print encode_entities(decode('utf-8', $TestStr), decode('utf-8', $UnsafeChar)) . "\n";
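
An alternative for literals written directly in the script is the utf8 pragma, which tells Perl that the source file itself is UTF-8, so the literals are already character strings and need no decode. This is only a sketch and assumes the script is saved as UTF-8. The helper encode_only is made up for illustration.

```perl
use strict;
use warnings;
use utf8;    # assumption: this source file is saved as UTF-8
use HTML::Entities qw(encode_entities);

# Hypothetical helper: encode only the characters listed in $unsafe.
sub encode_only {
    my ($text, $unsafe) = @_;
    my $encoded = encode_entities($text, $unsafe);
    return $encoded;
}

my $UnsafeChar = 'ïé';

# Only the two listed characters are replaced; "<" and ">" are left alone
# because they are not in the unsafe set.
print encode_only('naïve café <b>', $UnsafeChar), "\n";
# na&iuml;ve caf&eacute; <b>
```

Note that use utf8 only covers literals in the source. Data read from files or other input still has to be decoded separately.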