in reply to Re: UTF-8 webpage output from MySQL
in thread UTF-8 webpage output from MySQL

I thought these lines in my CGI::Application base class took care of the input and output encoding:
binmode STDIN, ":encoding(utf8)"; binmode STDOUT, ":encoding(utf8)";
and that my main problem was the data from the database.

Re^3: UTF-8 webpage output from MySQL
by moritz (Cardinal) on Jan 22, 2008 at 12:35 UTC
    In CGI scripts STDIN is only used to read POST data, so that's not all that interesting.

    But the problems with the templates remain - as long as you use HTML::Template, you'll have to be very careful not to mix binary and text strings. So if you don't want to waste your sanity on charset issues, you should really switch to a template system that is aware of character encodings.

    Since your templates already use HTML::Template syntax, I recommend one of the drop-in replacements: HTML::Template::Compiled or Template::Alloy.

    The line decode_utf8($tmpl->output); in the OP demonstrates that you decode the template's output. So if HTML::Template provides you with binary data, and DBI returns upgraded data (aka text strings), your problem actually occurred much earlier.
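    To make the mixing problem concrete, here is a minimal sketch using only the core Encode module. The two variables are hypothetical stand-ins: $template_bytes plays the role of undecoded bytes read from a UTF-8 template file, and $db_text plays the role of a decoded text string as DBI might return it. It does not call HTML::Template itself; it only shows what happens when the two kinds of string are joined and then pushed through an encoding layer.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8 encode_utf8);

# Hypothetical stand-in for template content: raw UTF-8 bytes for "Namn: åäö"
my $template_bytes = "Namn: \xC3\xA5\xC3\xA4\xC3\xB6";          # 12 bytes

# Hypothetical stand-in for database content: a decoded text string "åäö"
my $db_text = decode_utf8("\xC3\xA5\xC3\xA4\xC3\xB6");          # 3 characters

# Joining them upgrades the byte string as if it were Latin-1,
# so each of the 12 template bytes becomes one character.
my $mixed = $template_bytes . $db_text;                          # 15 characters

# This is roughly what an :encoding(utf8) layer on STDOUT does on output.
my $output = encode_utf8($mixed);

# The template part is now double-encoded; the database part is correct.
printf "mixed: %d characters, output: %d bytes\n",
    length($mixed), length($output);
```

    A correctly encoded page would be 18 bytes here; the extra 6 bytes come from the template's å, ä and ö being encoded a second time.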

      I lost my sanity over this a long time ago...

      I do not want to switch template systems. I can't be the only one using H::T with UTF-8, can I?

      I don't use the code below:

      decode_utf8($tmpl->output);
      It was just to show that if I did use it, the text from my template files would display the reverse question mark sign instead of å, ä and ö, but my database data would display correctly.

        ... and I explained why - because you mix binary and text (upgraded) data.

        If you don't want to switch template systems (foolish IMHO, but it's your choice) you can still make sure that all input to HTML::Template is binary data, not upgraded text strings. And make sure that no IO layer tries to encode them once again.
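        A rough sketch of the "everything is bytes" approach, assuming the database values arrive as decoded text strings. The helper to_utf8_bytes is hypothetical (it is not part of HTML::Template or DBI), and the template is again simulated with a plain byte string so that the example runs with core modules only.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8 encode_utf8);

# Hypothetical helper: return a copy of a parameter structure in which
# every plain scalar has been encoded to UTF-8 bytes.
# Only pass it decoded text; bytes that are already UTF-8 would be
# encoded a second time.
sub to_utf8_bytes {
    my ($data) = @_;
    if (ref $data eq 'HASH') {
        my %copy = map { ($_ => to_utf8_bytes($data->{$_})) } keys %$data;
        return \%copy;
    }
    if (ref $data eq 'ARRAY') {
        return [ map { to_utf8_bytes($_) } @$data ];
    }
    return (defined $data && !ref $data) ? encode_utf8($data) : $data;
}

# Stand-in for a row of decoded text coming back from the database
my %row = (
    name => decode_utf8("\xC3\xA5\xC3\xA4\xC3\xB6"),   # "åäö" as text
    tags => [ "\x{e5}" ],                               # "å" as text
);
my $params = to_utf8_bytes(\%row);

# Stand-in for template output: template bytes joined with parameter bytes.
my $page = "Namn: " . $params->{name};

# No :encoding layer here, so the bytes are written exactly once.
binmode STDOUT;
print $page, "\n";
```

        With a real template object the encoded values would be handed to $tmpl->param, and the binmode STDOUT, ":encoding(utf8)" line from the base class would have to go, since it would encode the output a second time.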

        Or you can go to the bug tracker, and locally apply one of the two patches that add an open mode to HT.

        Is there a particular reason you want to stick to a broken(*) template system? You don't have to change a single line in your templates if you use one that emulates HT's syntax.

        (*): broken for this application. I once asked about Handling Encoding in Templates, and there was no viable solution offered for HTML::Template.