Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks, I followed each step from: Unicode Wiki

When I use $c->response->body($content); ### $content contains some weird characters like Chinese :) I still get a warning message in my Catalyst trace output:
Wide character outside byte range in response. Encoding data as UTF-8 at C:/Perl64/site/lib/Plack/Util.pm line 91 (I used: perl script/myapp_server.pl -d -r -p 3000)

Do I still have to encode the message $content even though I set up: encoding=> 'utf8'? (I am using Catalyst 5.9007)
When I did: $content= decode("utf-8", $content); or utf8::decode($content), the warning went away.
In my understanding, Catalyst::Plugin::Unicode::Encoding automatically handles the encoding and decoding of messages (ins and outs). Is this true, or have I misunderstood?
Happy 2015!!!
  • Comment on Catalyst: Wide character outside byte range in response. Encoding data as UTF-8

Re: Catalyst: Wide character outside byte range in response. Encoding data as UTF-8
by Anonymous Monk on Dec 31, 2014 at 22:56 UTC
    This is pretty weird... Apparently Catalyst encodes decoded content (so far so good), but otherwise decodes stuff and feeds the decoded result to HTTP::Server::PSGI??? (which is where the warning comes from)
    package HTTP::Server::PSGI;
    ...
    sub _encode {
        if ($_[0] =~ /[^\x00-\xff]/) {
            Carp::carp("Wide character outside byte range in response. Encoding data as UTF-8");
            utf8::encode($_[0]);
        }
    }
    What is $content? Where does it come from?
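
    To see what that check reacts to, here is a minimal sketch using only core Perl and Encode. The helper name has_wide_char is made up for illustration and is not part of Plack or Catalyst; it just repeats the regex from the snippet above on a byte string, a decoded character string, and the same string after utf8::encode.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper (not a Plack/Catalyst function): true when the string
# holds a code point above 0xFF, i.e. something that cannot be sent as one byte.
sub has_wide_char {
    my ($string) = @_;
    return $string =~ /[^\x00-\xff]/ ? 1 : 0;
}

# "\xe4\xb8\xad" is the UTF-8 byte sequence for U+4E2D.
my $bytes = "\xe4\xb8\xad";
my $chars = decode('UTF-8', $bytes);    # one character, U+4E2D

printf "bytes:   wide=%d length=%d\n", has_wide_char($bytes), length $bytes;
printf "chars:   wide=%d length=%d\n", has_wide_char($chars), length $chars;

# utf8::encode turns the character string into UTF-8 bytes in place,
# which is the step the quoted _encode performs after warning.
my $encoded = $chars;
utf8::encode($encoded);
printf "encoded: wide=%d length=%d\n", has_wide_char($encoded), length $encoded;
```

    Only the decoded string trips the check, so the warning suggests that a character string (not bytes) reached the server layer.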
Re: Catalyst: Wide character outside byte range in response. Encoding data as UTF-8
by locked_user sundialsvc4 (Abbot) on Dec 31, 2014 at 17:00 UTC

    If you called decode() on $content before passing it to the template, then yes, you would “cause the message to go away.” But you may have altered the content in a destructive way, as seen by a Chinese user (though perhaps not as seen by non-Chinese you).

    What you probably want to do is to explicitly encode() the string before passing it to the template, thereby relieving the ever-vigilant Catalyst from having to do this for you. If the string contains only ASCII characters, encoding it as UTF-8 leaves the content unchanged. If it contains anything else, those characters become multi-byte UTF-8 sequences.
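
    A minimal sketch of that explicit encode step, using Encode from core Perl. The sample strings are made up, and the Catalyst call appears only in a comment because it assumes $content is a decoded character string and that nothing else in the stack encodes the body a second time.

```perl
use strict;
use warnings;
use Encode qw(encode);

my $ascii = "Happy 2015";
my $wide  = "\x{4e2d}\x{6587}";    # two Chinese characters, as a character string

my $ascii_bytes = encode('UTF-8', $ascii);
my $wide_bytes  = encode('UTF-8', $wide);

# ASCII-only text comes out byte-for-byte identical.
print "ASCII unchanged: ", ($ascii_bytes eq $ascii ? "yes" : "no"), "\n";

# Each of these characters becomes three bytes in UTF-8.
printf "wide: %d characters -> %d bytes\n", length $wide, length $wide_bytes;

# In a Catalyst action this would look roughly like the line below
# (an assumption about the surrounding setup, not tested here):
#   $c->response->body( encode('UTF-8', $content) );
```

    Encoding the same string twice would garble it, so it is worth settling on one place, either the action or the plugin, that does the encoding.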

    Decoding would be applied to incoming data, not outgoing data. (And if I have this backwards, you’ll see my mistake being pointed out by snarky replies and downvotes within a matter of minutes if not seconds ...)
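
    The direction can be sketched as a round trip: decode octets on the way in, work on characters, encode on the way out. The subs from_request and to_response are hypothetical stand-ins for the application boundaries, not Catalyst APIs.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical boundary helpers (illustration only, not Catalyst functions).
sub from_request { my ($octets) = @_; return decode('UTF-8', $octets); }
sub to_response  { my ($text)   = @_; return encode('UTF-8', $text); }

my $incoming = "caf\xc3\xa9";               # UTF-8 octets: "caf" plus e-acute
my $text     = from_request($incoming);     # decoded: 4 characters
my $upper    = uc $text;                    # character-level work on decoded text
my $outgoing = to_response($upper);         # octets again, ready to send

printf "in=%d octets, text=%d chars, out=%d octets\n",
    length $incoming, length $text, length $outgoing;
```

    Everything between the two helpers sees characters, so operations such as length and uc count and change letters rather than individual bytes.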

    I am not familiar with that plug-in, but what I would do is look at its source code. (Who knows, maybe that plug-in is what is actually producing the message?) Anyhow, I prefer not to rely upon plug-ins to do such things. I prefer to have full and conscious control of when and how every encoding/decoding step is done, because plug-ins necessarily have to “guess.”

      In order to handle Unicode issues, many things need to be checked and enabled. If there were a single flag to turn on UTF-8 everywhere, that would be very nice.