The following list of codes is missing a few, and I've only included the ones for character values 128 and up (ASCII proper stops at 127), but it should give you an idea of how to do this. Just add or remove codes as necessary. I generated my listing by parsing a random "HTML Character Codes" Google search result.
use strict;
use warnings;

# character value => [two forms of the replacement]; the substitution below uses index 1
my %codes = (
    '128' => ['Ä', 'ä'], '129' => ['Å', 'å'], '130' => ['Ç', 'ç'], '131' => ['É', 'é'],
    '132' => ['Ñ', 'ñ'], '133' => ['Ö', 'ö'], '134' => ['Ü', 'ü'], '135' => ['á', 'á'],
    '136' => ['à', 'à'], '137' => ['â', 'â'], '138' => ['ä', 'ä'], '139' => ['ã', 'ã'],
    '140' => ['å', 'å'], '141' => ['ç', 'ç'], '142' => ['é', 'é'], '143' => ['è', 'è'],
    '144' => ['ê', 'ê'], '145' => ['ë', 'ë'], '146' => ['í', 'í'], '147' => ['ì', 'ì'],
    '148' => ['î', 'î'], '149' => ['ï', 'ï'], '150' => ['ñ', 'ñ'], '151' => ['ó', 'ó'],
    '152' => ['ò', 'ò'], '153' => ['ô', 'ô'], '154' => ['ö', 'ö'], '155' => ['õ', 'õ'],
    '156' => ['ú', 'ú'], '157' => ['ù', 'ù'], '158' => ['û', 'û'], '159' => ['ü', 'ü'],
    '160' => ['†', '†'], '161' => ['ϒ', 'ϒ'], '162' => ['′', '′'], '163' => ['£', '£'],
    '164' => ['§', '§'], '165' => ['•', '•'], '166' => ['¶', '¶'], '167' => ['♣', '♣'],
    '168' => ['♦', '♦'], '169' => ['♥', '♥'], '170' => ['♠', '♠'], '171' => ['↔', '↔'],
    '172' => ['←', '←'], '173' => ['≠', '≠'], '174' => ['→', '→'], '175' => ['↓', '↓'],
    '176' => ['∞', '∞'], '177' => ['±', '±'], '178' => ['≤', '≤'], '179' => ['≥', '≥'],
    '180' => ['×', '×'], '181' => ['∝', '∝'], '182' => ['∂', '∂'], '183' => ['∑', '∑'],
    '184' => ['∏', '∏'], '185' => ['π', 'π'], '186' => ['≡', '≡'], '187' => ['ª', 'ª'],
    '188' => ['º', 'º'], '189' => ['Ω', 'ω'], '190' => ['æ', 'æ'], '191' => ['↵', '↵'],
    '192' => ['ℵ', 'ℵ'], '193' => ['ℑ', 'ℑ'], '194' => ['ℜ', 'ℜ'], '195' => ['√', '√'],
    '196' => ['⊗', '⊗'], '197' => ['⊕', '⊕'], '198' => ['∅', '∅'], '199' => ['∩', '∩'],
    '200' => ['∪', '∪'], '201' => ['⊃', '⊃'], '202' => [' ', ' '], '203' => ['⊄', '⊄'],
    '204' => ['⊂', '⊂'], '205' => ['⊆', '⊆'], '206' => ['∈', '∈'], '207' => ['∉', '∉'],
    '208' => ['∠', '∠'], '209' => ['∇', '∇'], '210' => ['“', '“'], '211' => ['”', '”'],
    '212' => ['‘', '‘'], '213' => ['’', '’'], '214' => ['÷', '÷'], '215' => ['◊', '◊'],
    '216' => ['ÿ', 'ÿ'], '217' => ['∧', '∧'], '218' => ['∨', '∨'], '219' => ['⇔', '↔'],
    '220' => ['⇐', '←'], '221' => ['⇑', '↑'], '222' => ['⇒', '→'], '223' => ['⇓', '↓'],
    '224' => ['‡', '†'], '225' => ['〈', '⟨'], '226' => ['‚', '‚'], '227' => ['„', '„'],
    '228' => ['‰', '‰'], '229' => ['Â', 'â'], '230' => ['Ê', 'ê'], '231' => ['Á', 'á'],
    '232' => ['Ë', 'ë'], '233' => ['È', 'è'], '234' => ['Í', 'í'], '235' => ['Î', 'î'],
    '236' => ['Ï', 'ï'], '237' => ['Ì', 'ì'], '238' => ['Ó', 'ó'], '239' => ['Ô', 'ô'],
    '241' => ['〉', '⟩'], '242' => ['Ú', 'ú'], '243' => ['Û', 'û'], '244' => ['Ù', 'ù'],
    '246' => ['ˆ', 'ˆ'], '247' => ['˜', '˜'], '248' => ['¯', '¯'], '252' => ['¸', '¸'],
    '255' => ['š', 'š'],
);

my $text = '™£¢??§¶•ª';

# Replace each character in the 128-255 range that has an entry in %codes.
# Characters with no entry (the list has gaps) are left as they are instead of
# being replaced with undef, which would also trigger a warning.
$text =~ s/([\x80-\xFF])/exists $codes{ord($1)} ? $codes{ord($1)}[1] : $1/ge;
print $text;
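Since the list is missing a few codes, it can help to find out which characters in your own text have no entry yet before you start adding to it. Here is a minimal sketch of that idea, not a drop-in replacement for the table above. The three-entry `%demo_codes` hash, the entity names in it, and the sub names `encode_high_bytes` and `unmapped_bytes` are all made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical mini table for illustration only: character value => replacement.
# The entity names are an assumption about what you would want to emit.
my %demo_codes = (
    163 => '&pound;',
    164 => '&sect;',
    166 => '&para;',
);

# Replace every character in the 128-255 range that has a table entry;
# characters without an entry pass through unchanged.
sub encode_high_bytes {
    my ($text) = @_;
    $text =~ s/([\x80-\xFF])/exists $demo_codes{ord($1)} ? $demo_codes{ord($1)} : $1/ge;
    return $text;
}

# Return the distinct character values (128-255) in the text that have no
# table entry, sorted numerically, so you know which codes still need adding.
sub unmapped_bytes {
    my ($text) = @_;
    my %seen;
    $seen{ ord($1) }++ while $text =~ /([\x80-\xFF])/g;
    return sort { $a <=> $b } grep { !exists $demo_codes{$_} } keys %seen;
}

my $sample = "Price: \x{A3}5, see \x{A4}2 \x{FE}";
print encode_high_bytes($sample), "\n";
print "unmapped: @{[ unmapped_bytes($sample) ]}\n";
```

Running `unmapped_bytes` over a sample of your real data first tells you which codes are worth adding, rather than guessing from the listing.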

In reply to Re: Detecting Strange Characters in Text? by TedPride
in thread Detecting Strange Characters in Text? by Anonymous Monk
