in reply to Re^2: HTML::Entities not encoding @ or .
in thread HTML::Entities not encoding @ or .

<humour>A guy in the back alley. I give him the password "monk" and $20 and he spills the beans....</humour>

No - um, nowhere except in the documentation and the trials described above. The documentation says

The module can also export the %char2entity and the %entity2char hashes, which contain the mapping from all characters to the corresponding entities (and vice versa, respectively).

Which I took to mean "all characters that will be encoded by default". Then I observed that encode_entities('@') does not encode @. So I wondered whether that was because @ was not in the %char2entity hash, working on the assumption that %char2entity is the list of chars to encode by default. Using help from this board, I exported %char2entity and printed it out

    use HTML::Entities;
    use HTML::Entities qw( %char2entity %entity2char ); # thanks ikegami

    # Dump every character => entity pair held in the lookup hash
    foreach my $val (keys %char2entity) {
        print "<br>$val => $char2entity{$val}\n";
    }
and found that @ IS in the %char2entity hash. Then trying your suggestion (assuming this is the same Anonymous Monk) of
encode_entities($a, "\000-\377");
found that simply telling the module which characters to encode results in them being encoded, even though that call supplies no new information about the character-to-entity mapping. The module must therefore already have that information. It occurs to me that this may be why not all chars are encoded by default even though %char2entity contains a full set of char-entity relations: because %char2entity is just a reference hash, NOT the list of chars to be encoded by default.

Re^4: HTML::Entities not encoding @ or .
by Anonymous Monk on Feb 14, 2008 at 12:53 UTC
    That's a weird thing to do, considering the documentation says: The default set of characters to encode are control chars, high-bit chars, and the <, &, >, and " characters.
    Reading the source would be better still:
    } else {
        # Encode control chars, high bit chars and '<', '&', '>', ''' and '"'
        $$ref =~ s/([^\n\r\t !\#\$%\(-;=?-~])/$char2entity{$1} || num_entity($1)/ge;
    }
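
To make the distinction concrete, here is a minimal sketch that mimics the behavior without loading the module. The %table hash and the encode_default / encode_all subs are hypothetical stand-ins, not part of HTML::Entities; only the character class is copied from the source quoted above. The numeric fallback uses decimal for brevity, which may differ from what num_entity actually emits.

```perl
use strict;
use warnings;

# Hypothetical stand-in for %char2entity: a few named entities plus a
# numeric entry for every byte 0..255. It is a lookup table only and
# says nothing about WHICH characters get encoded.
my %table = ( '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' );
for my $code (0 .. 255) {
    my $ch = chr $code;
    $table{$ch} = "&#$code;" unless exists $table{$ch};
}

# Default behavior: the character class decides what is encoded,
# the table only supplies the replacement text.
sub encode_default {
    my ($text) = @_;
    $text =~ s/([^\n\r\t !\#\$%\(-;=?-~])/$table{$1} || sprintf('&#%d;', ord $1)/ge;
    return $text;
}

# Explicit set covering every byte, comparable to passing "\000-\377".
sub encode_all {
    my ($text) = @_;
    $text =~ s/([\x00-\xFF])/$table{$1}/ge;
    return $text;
}

print encode_default('a@b.c <x>'), "\n";   # prints: a@b.c &lt;x&gt;
print encode_all('@.'), "\n";              # prints: &#64;&#46;
```

In this sketch @ (0x40) falls inside the ?-~ range and . (0x2E) inside the (-; range of the negated class, so neither is touched by default even though both have entries in the table.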