My content-type header looks like:
<meta http-equiv="content-type" content="text/html; charset=utf-8">

The page, the script, and the Apache configuration are all set to UTF-8 right now. Even so, when I paste text from MS Word into the form, all of the high-bit characters get stripped out. I just posted the source for the script to this thread; please take a look.

And I'll definitely read that page on i18n form usage; thank you for the link. I'm not sure whether it will fix the problem I'm dealing with, though. The biggest problem is that I can't count on the page encoding dictating the encoding of the strings I get back, so I'm trying to detect it with a hidden form variable instead. The accept-charset attribute just doesn't seem to work on my client's machines.
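For reference, here is a minimal sketch of the hidden-variable idea, not the script I posted. It assumes a hypothetical hidden field named enc_probe whose value is a single e-acute (written as &#233; in the page source), and it assumes that the bytes the browser sends back for that field reveal which encoding it used for the whole form. The function names are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical hidden field in the form:
#   <input type="hidden" name="enc_probe" value="&#233;">
# The browser submits U+00E9 in whatever encoding it really used.
sub guess_form_encoding {
    my ($probe_bytes) = @_;
    return unless defined $probe_bytes;
    return 'UTF-8'  if $probe_bytes eq "\xC3\xA9";  # e-acute encoded as UTF-8
    return 'cp1252' if $probe_bytes eq "\xE9";      # e-acute as windows-1252 / Latin-1
    return;                                         # unknown: caller picks a default
}

# Decode one raw form value into a Perl character string.
# Falling back to UTF-8 is an assumption; choose whatever suits the site.
sub decode_field {
    my ($raw_bytes, $probe_bytes) = @_;
    my $encoding = guess_form_encoding($probe_bytes) || 'UTF-8';
    return decode($encoding, $raw_bytes);
}
```

A single e-acute can't distinguish windows-1252 from ISO-8859-1, so the sketch treats both as cp1252 (Encode's name for windows-1252), which also covers the curly quotes that Word likes to produce.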

I might have to try the windows-1252 charset option you mention, though I'm not sure that will fix all of the broken behavior either. I still have to convert the text to Latin-1, escaping anything outside it as HTML entities, before inserting it into MySQL (no Unicode support).
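The Latin-1 escaping step can be sketched roughly like this, assuming the input has already been decoded into a Perl character string. The function name is hypothetical; Encode::FB_HTMLCREF is the core Encode fallback that writes unmappable characters as decimal numeric entities.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Takes a decoded character string and returns Latin-1 bytes.
# Characters that Latin-1 can't represent become &#NNNN; entities.
sub to_latin1_with_entities {
    my ($text) = @_;    # work on a copy; encode() may modify its source when CHECK is set
    return encode('iso-8859-1', $text, Encode::FB_HTMLCREF);
}
```

This doesn't escape a literal ampersand, so text in which the user actually typed something like &#8220; is indistinguishable from an escaped curly quote once it is stored.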

Thank You,
Troy


In reply to Re^2: Encoding confusion with CGI forms by davistv
in thread Encoding confusion with CGI forms by davistv
