in reply to Handling HTML special characters correctly

The way that text is encoded in a form POST depends on the encoding of the HTML page containing the form. So it is always advisable to explicitly declare your page encoding. This should be done in the response header with:
Content-Type: text/html; charset=UTF-8
Another way is to use a META tag in your HTML output:
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">
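As a rough illustration of both declarations together, here is a minimal sketch that uses only core Perl. The helper name html_header is hypothetical (it is not part of CGI.pm or any other module), and the page body is a placeholder:

```perl
use strict;
use warnings;

# Hypothetical helper (not from any module): builds the HTTP header block
# that declares the page encoding. Defaults to UTF-8.
sub html_header {
    my ($charset) = @_;
    $charset = 'UTF-8' unless defined $charset;
    return "Content-Type: text/html; charset=$charset\r\n\r\n";
}

# Make sure the bytes we actually send match the encoding we declare.
binmode(STDOUT, ':encoding(UTF-8)');

print html_header();
print qq{<html><head>\n};
print qq{<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">\n};
print qq{</head><body><form method="post">...</form></body></html>\n};
```

If you are already using CGI.pm, its header() function accepts a -charset argument and would do the same job as the hand-built header here.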

Also, when you receive form parameters in a CGI script, you always need to decode them according to how they were encoded by the form:

    use CGI qw(:standard);
    use Encode;
    ...
    my $name = Encode::decode('utf-8', scalar(param('name')));
Now $name contains Unicode code points (a Perl character string) rather than raw bytes, which is probably the most useful representation for your application. From there you can encode it to any particular encoding when you need to.
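To make the bytes-versus-characters distinction concrete, here is a small sketch using only the core Encode module. The sample value is an assumption for illustration: it stands in for what param('name') might return if someone typed "café" into a form on a UTF-8 page.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumed sample input: the UTF-8 bytes for "café" as submitted by the form.
# The "é" arrives as two bytes, 0xC3 0xA9.
my $bytes = "caf\xC3\xA9";

# Decode the bytes into a character string (code points).
my $name = decode('UTF-8', $bytes);

# length() counts bytes before decoding and characters after it.
my $byte_count = length($bytes);   # 5
my $char_count = length($name);    # 4

# From the character string, encode to whatever the destination needs,
# for example ISO-8859-1, where "é" is the single byte 0xE9.
my $latin1 = encode('ISO-8859-1', $name);
```

String operations such as length, substr, uc and regex matches only behave sensibly on non-ASCII text once it has been decoded, which is why decoding at the point of input is worth the extra line.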

The article "Character Conversions from Browser to Database" does a good job of explaining the issues involved.

Re^2: Handling HTML special characters correctly
by cosmicperl (Chaplain) on Jul 03, 2008 at 00:33 UTC
    Thanks for the link, I'm sure it'll be very useful for my future projects :D