in reply to problem with UTF-8/YAML/Formfu

Find out whether (and how) YAML allows an encoding to be specified for its files. Also make sure your file actually is in the encoding it should be.

Third, make sure that your page is actually served in the encoding you think it is, and that it has the matching Content-Type header.

Update: Googling for "YAML encoding" suggests that YAML files are assumed to be UTF-8 if there is no BOM, so either you found a bug in whatever YAML library you use (likely) or your file is not UTF-8 (also likely). Check both.
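To check the "is my file really UTF-8" half, here is a minimal sketch using the core Encode module. The helper name is_valid_utf8 is made up for this example; it takes raw bytes and reports whether they decode cleanly as strict UTF-8.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK);

# Hypothetical helper: true if the byte string is well-formed UTF-8.
# FB_CROAK makes decode() die on the first malformed sequence.
sub is_valid_utf8 {
    my ($bytes) = @_;    # work on a copy; decode() with a CHECK value may modify its argument
    return eval { decode('UTF-8', $bytes, FB_CROAK); 1 } ? 1 : 0;
}

# "año" encoded as UTF-8 (ñ = C3 B1) versus Latin-1 (ñ = F1).
print is_valid_utf8("a\xC3\xB1o") ? "valid\n" : "invalid\n";
print is_valid_utf8("a\xF1o")     ? "valid\n" : "invalid\n";
```

To test a real file, read it through a handle opened with '<:raw' so that no decoding happens before the check, and pass the contents to the helper.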

Re^2: problem with UTF-8/YAML/Formfu
by lgn8412 (Initiate) on Jan 04, 2011 at 15:27 UTC

    Thanks for the quick response. I've investigated this a bit, and here's what I've got:

    1. Taken from HTML::FormFu::Manual::Unicode: "If you're using YAML config files, your files will automatically be decoded by load_config_file and load_config_filestem."

    2. My YAML file is encoded as UTF-8

    3. My webpage has the charset set to UTF-8 and that doesn't fix it.

    My guess is that somewhere in the handling of the YAML file, where FormFu decodes and encodes it (I think), load_config_file must have an encode or decode option that needs to be set to UTF-8.

    So the real question is.. is there a way to get into the load_config_file thingie to set some options like the one I need?

      My webpage has the charset set to UTF-8 and that doesn't fix it.

      Did you actually encode your webpage using UTF-8?

      When UTF-8 is expected, I would expect exactly that output if one outputs U+00F1 as byte F1 instead of bytes C3 B1.

      In a UTF-8 terminal:

      $ perl -E'say "\N{U+00F1}";'
      ?     <--- Actually U+FFFD, what you posted
      $ perl -MEncode -E'say encode("UTF-8", "\N{U+00F1}");'
      ñ

      If it was a lack of decoding on input as you suspect, one would get multiple gibberish characters rather than the coding error indicated by the character you posted.

      $ perl -MEncode -E'say encode("UTF-8", "\xC3\xB1");'
      Ã±

      It sounds to me like you are properly decoding the text on input while failing to encode it on output.
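A minimal sketch of that diagnosis, using only the core Encode module. The variable names and the hex_bytes helper are made up for illustration. It dumps bytes as hex rather than printing the characters, so the result does not depend on your terminal.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: show a string's bytes as space-separated hex.
sub hex_bytes {
    my ($str) = @_;
    return join ' ', map { sprintf '%02X', $_ } unpack 'C*', $str;
}

my $input_bytes = "\xC3\xB1";                       # "ñ" as stored in a UTF-8 file
my $text        = decode('UTF-8', $input_bytes);    # properly decoded: one character, U+00F1

my $unencoded = $text;                     # sent as-is: the lone byte F1
my $encoded   = encode('UTF-8', $text);    # what a UTF-8 page needs: C3 B1

printf "characters after decode: %d\n", length $text;
printf "without encode: %s\n", hex_bytes($unencoded);
printf "with encode:    %s\n", hex_bytes($encoded);
```

Instead of calling encode() on every string, you can put an encoding layer on the output handle once, e.g. binmode(STDOUT, ':encoding(UTF-8)'), and then print decoded text directly.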

      Hi, if you're not using Catalyst and setting everything advised in HTML::FormFu::Manual::Unicode, you need to read the whole document, and apply the same principles to your project. In particular, see the "HTML sent to the browser" section of that page.

      btw, I only noticed a couple days ago that DBIx::Class::UTF8Columns is no longer advised, so that part of the manual will need updating - in the meantime, you should see the DBIx::Class docs for details.