jkeenan1 has asked for the wisdom of the Perl Monks concerning the following question:
I receive files with human names in ALL CAPS, including accented upper case characters. The files are in iso-8859-1.
    CLÉ USB
    CLÉMMY USB
I need to "proper-case" these words, i.e., Initial Cap All Other Characters Lower Case.
When I open these files in vi, I don't see the upper-case accented E; I see a question mark.
    CL? USB
    CL?MMY USB
But when I run these lines through Encode::from_to, I do get the upper-case accented E. But no matter what I do, I can't seem to lower-case that upper-case accented E. In fact, the succeeding character remains upper-case as well.
    use strict;
    use warnings;
    use feature qw( :5.10 );
    use Data::Dumper;$Data::Dumper::Indent=1;
    use Carp;
    use Encode qw( from_to );
    use POSIX qw( setlocale LC_CTYPE );
    setlocale(LC_CTYPE, "fr_CA.ISO8859-1");
    my $file = q{./yard};
    open my $IN, '<', $file or croak;
    while (my $l = <$IN>) {
        chomp $l;
        say "1: $l";
        my $m = $l;
        from_to($m, "iso-8859-1", "utf8");
        say "2: $m";
        say "3: ", xlc($m);
    }
    close $IN or croak;
    sub xlc {
        my $str = shift;
        return join q{} => (
            map { ucfirst(lc($_)) } ( $str =~ m/(\W+|\w+)/g )
        );
    };
Results:
    1: CL? USB
    2: CLÉ USB
    3: ClÉ Usb
    1: CL?MMY USB
    2: CLÉMMY USB
    3: ClÉMmy Usb
What am I doing wrong?
Thank you very much.
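A note on the output above, offered as a hedged sketch and not as any replier's answer: from_to converts the data in place to UTF-8 octets, so the string stays a byte string. The accented E is then two bytes, neither of which counts as a word character or has a lower-case form under byte semantics, which would explain why "CL", the accented E, and "MMY" are cased as three separate pieces. One common way around this is to decode the octets into a character string with Encode::decode, change case on the characters, and encode again only on output. The sub name proper_case and the hard-coded sample bytes are illustrative assumptions, and the sketch assumes a UTF-8 terminal.

```perl
use strict;
use warnings;
use Encode qw( decode encode );

# Hypothetical helper: proper-case a string of *characters* (not bytes).
# Each run of word characters is lower-cased, then its first character
# is upper-cased; everything else (spaces, punctuation) is left alone.
sub proper_case {
    my ($chars) = @_;
    $chars =~ s/(\w+)/ucfirst lc $1/ge;
    return $chars;
}

# Bytes as they would arrive from an ISO-8859-1 file: 0xC9 is the
# upper-case accented E.
my $bytes = "CL\xC9MMY USB";

# decode() returns a character string, so lc and \w treat U+00C9 as a letter.
my $chars  = decode('iso-8859-1', $bytes);
my $proper = proper_case($chars);

# Encode only at output time; UTF-8 output is an assumption here.
print encode('UTF-8', $proper), "\n";
```

When reading from a file, the decoding step can instead be done by opening the handle with an encoding layer, for example `open my $IN, '<:encoding(iso-8859-1)', $file`, so that each line read is already a character string.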
Replies are listed 'Best First'.
Re: Unable to lc upper case accented characters
by Eliya (Vicar) on Feb 19, 2011 at 02:40 UTC
by jkeenan1 (Deacon) on Feb 19, 2011 at 03:06 UTC
Re: Unable to lc upper case accented characters
by wind (Priest) on Feb 19, 2011 at 02:53 UTC
by Jim (Curate) on Feb 20, 2011 at 00:31 UTC
Re: Unable to lc upper case accented characters
by Jim (Curate) on Feb 19, 2011 at 03:40 UTC