in reply to Re^2: Accented letter is not capitalised
in thread Accented letter is not capitalised

Somebody wrote:

    #!/usr/bin/perl
    use warnings;
    use strict;
    use Encode qw(encode decode);

    my $enc = 'utf-8'; # This script is stored as UTF-8
    my $str = "úlcera\n";

    # Byte strings:
    print ucfirst $str; # prints 'úlcera', ucfirst didn't have any effect.

Whoa, that’s never going to work! And you should (very very very almost) not ever need to be calling encode/decode yourself, either.
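
To make the failure concrete, here is a minimal sketch of what ucfirst sees in each case. It assumes the original script had no `use utf8`, so its literal reached Perl as raw UTF-8 bytes. The names `$bytes` and `$chars` are illustrative only, and decode is called here purely to put both forms side by side in one file; the approach shown in this post makes that call unnecessary.

```perl
use strict;
use warnings;
use Encode qw(decode);

# \x escapes keep this example independent of how the file itself is saved.
my $bytes = "\xC3\xBAlcera";            # UTF-8 bytes of "úlcera", never decoded
my $chars = decode("UTF-8", $bytes);    # the same word as six characters

# Report the length, the first code point, and the first code point after ucfirst.
for my $pair ([bytes => $bytes], [chars => $chars]) {
    my ($label, $text) = @$pair;
    printf "%s: length %d, first U+%04X, after ucfirst U+%04X\n",
        $label, length($text), ord($text), ord(ucfirst $text);
}
# bytes: length 7, first U+00C3, after ucfirst U+00C3
# chars: length 6, first U+00FA, after ucfirst U+00DA
```

In the byte string the first element is 0xC3, the lead byte of the encoded ú, so ucfirst has nothing to capitalise. In the character string the first element is U+00FA, which maps to U+00DA.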

Honest, this is really very easy. Watch:

    use utf8;
    use strict;
    use warnings;
    use warnings FATAL => "utf8";
    use feature "unicode_strings"; # or use v5.12 or superior
    use open qw(:std :utf8);

    print ucfirst("úlcera\n");

...very most assuredly does indeed print out Úlcera. Don’t go by appearances: trust only the numbers. Thus:

    $ perl ulstertest
    Úlcera
    $ perl ulstertest | uniquote -x
    \x{DA}lcera
    $ perl ulstertest | uniquote -v
    \N{LATIN CAPITAL LETTER U WITH ACUTE}lcera
    $ perl ulstertest | uniquote -b
    \xC3\x9Alcera

The outer pair of tests above risk only confusion; it is the inner pair that are wholly dispositive and convincing: trust the output of uniquote -v and uniquote -x to give you something you can actually read and depend on.
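
If uniquote is not installed, the same check can be approximated in plain Perl. The sketch below is not uniquote itself: `show_codepoints` and `show_bytes` are hypothetical helpers that only imitate the `-x` and `-b` output shown above, and encode is used solely to inspect the bytes that would be written.

```perl
use strict;
use warnings;
use utf8;                      # the literal below is character data, not bytes
use Encode qw(encode);

# Hypothetical helper: render each non-ASCII character as \x{HEX}.
sub show_codepoints {
    my ($text) = @_;
    (my $out = $text) =~ s/([^\x00-\x7F])/sprintf("\\x{%X}", ord $1)/ge;
    return $out;
}

# Hypothetical helper: render each non-ASCII byte of the UTF-8 encoding as \xHH.
sub show_bytes {
    my ($text) = @_;
    my $bytes = encode("UTF-8", $text);
    $bytes =~ s/([^\x00-\x7F])/sprintf("\\x%02X", ord $1)/ge;
    return $bytes;
}

my $word = ucfirst "úlcera";
print show_codepoints($word), "\n";   # \x{DA}lcera
print show_bytes($word), "\n";        # \xC3\x9Alcera
```

Both lines of output are pure ASCII, so they read the same whatever the terminal's encoding is.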

Like I said, just play it by the numbers.

--tom

Re^4: Accented letter is not capitalised
by Steve_BZ (Chaplain) on Feb 17, 2012 at 16:05 UTC

    Hi Tom,

    Thanks for this, it looks just the job. Unfortunately, it says unicode_strings is not available in 5.10.1. Is there an equivalent piece of code that doesn't require 5.12.1?

    Regards

    Steve

      Thanks for this, it looks just the job. Unfortunately, it says unicode_strings is not available in 5.10.1. Is there an equivalent piece of code that doesn't require 5.12.1?

      With the feature line removed and everything else left in place, I get the same correct answer under both v5.10.0 and v5.10.1.
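
One way to keep a single script for both old and new perls is to load the feature only where it exists. This is a sketch rather than something taken from the thread: it assumes the core `if` pragma is available and uses `$] >= 5.012` as the cutoff for unicode_strings.

```perl
use utf8;
use strict;
use warnings;
use warnings FATAL => "utf8";
# Enable unicode_strings only on perls that have it (assumed: v5.12 and later).
use if $] >= 5.012, feature => "unicode_strings";
use open qw(:std :utf8);

print ucfirst("úlcera\n");
```

On v5.10 the conditional line does nothing and the script behaves like the version with that line deleted.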

      --tom