I don't see in ikegami's script the need for use utf8;.
Both the OP's script and ikegami's contain the string 'Fräsen und ndk (Kamera - Fräsaufnahme)'. From the utf8 documentation: "The use utf8 pragma tells the Perl parser to allow UTF-8 in the program text in the current lexical scope. ... Do not use this pragma for anything else than telling Perl that your script is written in UTF-8. ... Because it is not possible to reliably tell UTF-8 from native 8 bit encodings, you need either a Byte Order Mark at the beginning of your source code, or use utf8;, to instruct perl."
Although the "ä" may happen to appear to work because it's part of the Latin1 character set, which Perl typically uses internally, it will most likely not do what you want with any Unicode characters outside of that set. As you can see below, the only version of the code in which the UTF8 flag is properly set on the string is the one where the source is encoded as UTF-8 and use utf8; is used. The rule of thumb I always use is to either work entirely in ASCII (using escapes such as \N{} to specify Unicode characters), or otherwise to encode the source code as UTF-8 and add use utf8;. See also perluniintro and perlunicode.
$ cat with_utf8.pl
use warnings;
use strict;
use utf8;
use Devel::Peek;
my $string = 'Fräsen und ndk (Kamera - Fräsaufnahme)';
Dump($string);

$ perl -pe 's/^(?=.*utf8)/#/' with_utf8.pl | tee without_utf8.pl
use warnings;
use strict;
#use utf8;
use Devel::Peek;
my $string = 'Fräsen und ndk (Kamera - Fräsaufnahme)';
Dump($string);

$ iconv -f UTF-8 -t Latin1 without_utf8.pl -o latin1.pl

$ file -i *.pl
latin1.pl:       text/plain; charset=iso-8859-1
without_utf8.pl: text/plain; charset=utf-8
with_utf8.pl:    text/plain; charset=utf-8

$ perl latin1.pl
SV = PV(0x1365d70) at 0x13855c0
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK)
  PV = 0x13d7160 "Fr\344sen und ndk (Kamera - Fr\344saufnahme)"\0
  CUR = 38
  LEN = 40
  COW_REFCNT = 1

$ perl without_utf8.pl
SV = PV(0xa15d70) at 0xa355c0
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK)
  PV = 0xa87190 "Fr\303\244sen und ndk (Kamera - Fr\303\244saufnahme)"\0
  CUR = 40
  LEN = 42
  COW_REFCNT = 1

$ perl with_utf8.pl
SV = PV(0x18d5d70) at 0x18f55d8
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK,UTF8)
  PV = 0x19384a0 "Fr\303\244sen und ndk (Kamera - Fr\303\244saufnahme)"\0 [UTF8 "Fr\x{e4}sen und ndk (Kamera - Fr\x{e4}saufnahme)"]
  CUR = 40
  LEN = 42
  COW_REFCNT = 1
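For the other half of the rule of thumb (ASCII-only source with escapes), here is a minimal sketch. It is my own illustration rather than code from the thread, and it assumes the output goes to a UTF-8 terminal or file. Because the source contains no literal non-ASCII bytes, it behaves the same regardless of how the file is saved, and use utf8; is not needed.

```perl
use warnings;
use strict;

# ASCII-only source: the "ä" is written as the escape \N{U+E4},
# so the result does not depend on the encoding of this file.
my $string = "Fr\N{U+E4}sen und ndk (Kamera - Fr\N{U+E4}saufnahme)";

# Assumption: STDOUT expects UTF-8, so encode explicitly on output.
binmode STDOUT, ':encoding(UTF-8)';

# length() counts characters here, not encoded bytes.
printf "%d characters\n", length $string;
print "$string\n";
```

Note that the escape only works in a double-quoted string; in single quotes, as used in the scripts above, \N{U+E4} would be taken literally.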
Updated as per ikegami's reply.
In reply to Re^3: german Alphabet
by haukex
in thread german Alphabet
by shreedara75