Unfortunately, no, there is no real premade way to translate them into ASCII, because those characters do not exist in ASCII. What to do depends on the requirements of your database and on the scheme you choose for converting the various characters.

If your db will support Unicode, the easiest thing to do would be to convert the encoding from cp1252 to utf-8.

use Encode qw(from_to);   # the module is Encode; from_to is not exported by default
my $record = 'whatever';
from_to($record, 'cp1252', 'utf-8');   # converts $record in place

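If it will handle Latin-1, then all you need to do is deal with the characters in the range \x80-\x9F and then re-encode to 'iso-8859-1'. In cp1252 that range holds printable characters such as curly quotes, dashes and the euro sign, while in Latin-1 it holds only control codes. Outside \x80-\x9F, MS cp1252 and Latin-1 are identical.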
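
A minimal sketch of that Latin-1 route, assuming the input is a string of cp1252 bytes. The sub name cp1252_to_latin1 and the replacement table are made up for illustration; the table covers only a handful of the \x80-\x9F characters and the stand-ins are just one possible choice, so extend or change it to suit your data.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical replacement table: Unicode code points of some cp1252-only
# characters, mapped to stand-ins that exist in Latin-1.
my %latin1_sub = (
    "\x{20AC}" => 'EUR',   # euro sign          (cp1252 0x80)
    "\x{2026}" => '...',   # ellipsis           (cp1252 0x85)
    "\x{2018}" => "'",     # left single quote  (cp1252 0x91)
    "\x{2019}" => "'",     # right single quote (cp1252 0x92)
    "\x{201C}" => '"',     # left double quote  (cp1252 0x93)
    "\x{201D}" => '"',     # right double quote (cp1252 0x94)
    "\x{2013}" => '-',     # en dash            (cp1252 0x96)
    "\x{2014}" => '--',    # em dash            (cp1252 0x97)
);

sub cp1252_to_latin1 {
    my ($bytes) = @_;
    my $text = decode('cp1252', $bytes);    # bytes -> Perl characters
    # Anything above U+00FF has no Latin-1 equivalent:
    # use the table where possible, otherwise fall back to '?'.
    $text =~ s/([^\x{00}-\x{FF}])/exists $latin1_sub{$1} ? $latin1_sub{$1} : '?'/ge;
    return encode('iso-8859-1', $text);     # characters -> Latin-1 bytes
}

print cp1252_to_latin1("\x93quoted\x94 \x96 caf\xE9"), "\n";
```

Characters that the two encodings share, such as \xE9, pass through unchanged; only the \x80-\x9F range is rewritten.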

If you really need ASCII, you will have to come up with your own transliteration scheme, deciding which ASCII character (or characters) is an acceptable replacement for each of the "upper" characters.
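
One possible shape for such a scheme, as a rough sketch only. It assumes cp1252 bytes as input, uses a small hand-written table (hypothetical, and far from complete) for characters that need explicit replacements, strips accents from the rest via Unicode::Normalize, and turns anything still outside ASCII into '?'. The sub name cp1252_to_ascii is made up for this example.

```perl
use strict;
use warnings;
use Encode qw(decode);
use Unicode::Normalize qw(NFD);

# Hypothetical table of explicit replacements; which stand-in is
# "acceptable" is your call, these are only examples.
my %ascii_sub = (
    "\x{20AC}" => 'EUR',                       # euro sign
    "\x{2026}" => '...',                       # ellipsis
    "\x{2018}" => "'",  "\x{2019}" => "'",     # single curly quotes
    "\x{201C}" => '"',  "\x{201D}" => '"',     # double curly quotes
    "\x{2013}" => '-',  "\x{2014}" => '--',    # en dash, em dash
    "\x{00DF}" => 'ss',                        # sharp s
    "\x{00C6}" => 'AE', "\x{00E6}" => 'ae',    # AE ligatures
    "\x{00A9}" => '(c)',                       # copyright sign
);

sub cp1252_to_ascii {
    my ($bytes) = @_;
    my $text = decode('cp1252', $bytes);       # bytes -> Perl characters
    # 1. Apply the explicit table to non-ASCII characters it knows about.
    $text =~ s/([^\x00-\x7F])/exists $ascii_sub{$1} ? $ascii_sub{$1} : $1/ge;
    # 2. Split accented letters into base letter + combining mark,
    #    then drop the combining marks.
    $text = NFD($text);
    $text =~ s/\p{Mn}//g;
    # 3. Whatever is still not ASCII gets a placeholder.
    $text =~ s/[^\x00-\x7F]/?/g;
    return $text;
}

print cp1252_to_ascii("caf\xE9 \x93ok\x94 \x80"), "\n";
```

The table runs before the accent stripping so that characters with no useful decomposition, such as the sharp s, get a deliberate replacement rather than a '?'.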

There is a list of cp1252 characters with their Unicode codepoints available at http://www.microsoft.com/typography/unicode/1252.htm


In reply to Re: Mass regsub on High-bit chars. by thundergnat
in thread Mass regsub on High-bit chars. by abaxaba
