in reply to Do I have a unicode problem, or is this something else?

What ikegami is saying is that the data you get from your database is indeed UTF-8 data, but Perl is unaware of this, and its default behavior is to treat the data as bytes (which end up, in your situation, as single-byte Latin-1 characters). So you need to decode it explicitly:
binmode STDOUT, ":utf8";
my $u = __t("Patient Id");
utf8::decode( $u );
print $u,"\n";
Please check the utf8 manual for more details. In some situations, it might be preferable to use the Encode module:
use Encode;
binmode STDOUT, ":utf8";
my $u = decode( "utf8", __t("Patient Id"));
print $u;
(updated to use "$u" instead of "$a" -- lexical instances of perl globals can lead to confusion and anxiety...)
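
To make the bytes-versus-characters distinction concrete, here is a minimal sketch. The byte string below is a hypothetical stand-in for what __t() might return from the database (it is not taken from the original question); the only thing assumed is that the data arrives as undecoded UTF-8.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical stand-in for the database value: the UTF-8 bytes of "Patiënt".
# The two bytes C3 AB together encode the single character U+00EB.
my $bytes = "Pati\xC3\xABnt";

# Undecoded, Perl counts one character per byte.
my $before = length $bytes;          # 8

# Encode::decode returns a new, decoded character string.
my $chars = decode("UTF-8", $bytes);
my $after = length $chars;           # 7

# utf8::decode does the same job in place, and is available
# without loading any module or pragma.
my $copy = $bytes;
utf8::decode($copy);

binmode STDOUT, ":encoding(UTF-8)";
print "bytes: $before, characters: $after\n";
print "same result from both approaches\n" if $copy eq $chars;
print "$chars\n";
```

Either approach should give the same character string for valid input; the practical difference is that decode() returns a new value while utf8::decode() modifies its argument.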

Re^2: Do I have a unicode problem, or is this something else?
by Steve_BZ (Chaplain) on Jun 10, 2010 at 15:16 UTC

    Hi Graff,

    Thanks for this. So what I understand is the use utf8 that I have in my modules will just simplify any

    binmode STDOUT, ":utf8";
    my $a = __t("Patient Id");
    utf8::decode( $a );
    print $a,"\n";
    to
    binmode STDOUT, ":utf8";
    my $a = __t("Patient Id");
    decode( $a );
    print $a,"\n";

    Presumably I can also insert this code into __t() and not worry about putting it elsewhere.

    Thanks for this: very helpful.

    Have a good day.

    regards

    Steve

      So what I understand is the use utf8 that I have in my modules will just simplify any ... to ...

      If you think this is an enhancement -- and you have no other reason for use utf8 in your code -- I would consider it a false "advantage", especially if you need (now or in the future) to add use Encode to your script, since you will then have a clash in how the decode() function is defined.

      Did you notice this (rather prominent) passage in the perldoc "utf8" man page?

      Do not use this pragma for anything else than telling Perl that your script is written in UTF-8.

      (Italics added, bold in original.)
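
As a rough illustration of what that passage means, the sketch below shows the one thing the pragma changes: how non-ASCII string literals in the source file are read. It assumes the script itself is saved as UTF-8; the variable names are made up for the example.

```perl
use strict;
use warnings;

my ($with, $without);

{
    use utf8;        # literals in this block are read as UTF-8 characters
    $with = "ë";
}
{
    no utf8;         # the default: the literal stays as its raw bytes
    $without = "ë";
}

# With the pragma the literal is one character; without it, two bytes.
print length($with), " ", length($without), "\n";   # prints "1 2"
```

Note that the pragma has no effect on data read at run time (from a database, a file, or a socket); that data still has to be decoded explicitly.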

        Hi Graff,

        Well, in fact the first mechanism didn't work for me but the second did. I'm sure it was something to do with my own code! So I changed use utf8 to use Encode throughout and everything sprang to life, including some code which I didn't know wasn't working! So now I have use Encode but not use utf8. Do you think that is OK?

        Regards

        Steve

      utf8 doesn't export any functions, at least not by default. Your second snippet doesn't run.

      $ perl -e'use utf8; decode($_)'
      Undefined subroutine &main::decode called at -e line 1.

        Hi ikegami,

        You're right, I got the same error. I used use Encode in the end.

        Regards

        Steve