&nbsp;, also known as &#160;, is U+00A0: NO-BREAK SPACE.
&sim; is U+223C: TILDE OPERATOR.
HTML::Entities handles both just fine:
>perl -e"use HTML::Entities qw( decode_entities ); printf('U+%04X', ord(decode_entities($ARGV[0])))" "&nbsp;"
U+00A0

>perl -e"use HTML::Entities qw( decode_entities ); printf('U+%04X', ord(decode_entities($ARGV[0])))" "&sim;"
U+223C
I suspect you have a bug in your output code. You probably forgot to encode the text string returned by decode_entities into a binary string appropriate for your terminal or for the file to which you are outputting the string.
This can be done by adding the :encoding(...) layer on open, by adding the :encoding(...) layer using binmode, or by explicitly encoding using Encode's encode function.
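The three options could look roughly like the sketch below. It is only an illustration under a few assumptions: UTF-8 is assumed as the target encoding (use whatever your terminal or file actually expects), in-memory file handles stand in for a real file or STDOUT, and $text is written out by hand as a stand-in for the string decode_entities would return, so that only core modules are needed. The hex_bytes helper is hypothetical and exists only to display the result.

```perl
use strict;
use warnings;
use Encode qw( encode );

# Stand-in for the text string decode_entities('&nbsp;&sim;') would return:
# U+00A0 followed by U+223C.
my $text = "\x{A0}\x{223C}";

# Hypothetical helper: show a byte string as uppercase hex.
sub hex_bytes { return uc unpack('H*', $_[0]) }

# 1. :encoding(...) layer given on open.
#    (In-memory file used here; a file name works the same way.)
my $via_open = '';
open(my $fh1, '>:encoding(UTF-8)', \$via_open) or die "open: $!";
print {$fh1} $text;
close($fh1);

# 2. :encoding(...) layer added with binmode to an already-open handle,
#    as you would do with STDOUT.
my $via_binmode = '';
open(my $fh2, '>', \$via_binmode) or die "open: $!";
binmode($fh2, ':encoding(UTF-8)');
print {$fh2} $text;
close($fh2);

# 3. Explicit encoding with Encode's encode function.
my $via_encode = encode('UTF-8', $text);

# All three yield the same bytes.
print hex_bytes($_), "\n" for $via_open, $via_binmode, $via_encode;
```

For a terminal, the binmode form is usually the least intrusive, since STDOUT is already open: binmode(STDOUT, ':encoding(UTF-8)') before the first print.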
In reply to Re: special HTML Characters by ikegami, in thread special HTML Characters by chuck_norris