> The code you presented only contains 7-bit ASCII characters.

erm ... åäö???

update

> Do not use this pragma for anything else than telling Perl that your script is written in UTF-8.

Unfortunately this line is easily misunderstood. I recently had a long dispute with a camel award winner who read it wrongly.

Many think it only means you can use Unicode characters in identifiers, like $möhre or sub née, but it also covers string literals, since they are read through the same file handle as the rest of the source (DATA).

Please note how the UTF8 flag is set for $t2 (see FLAGS)

use v5.12;
use warnings;
use Devel::Peek;
my $t1='åäö';
Dump $t1;
use utf8;
my $t2='åäö';
Dump $t2;
my $t3 = "\N{LATIN SMALL LETTER A WITH RING ABOVE}\n";
say $t3;
Dump $t3;
OUTPUT:
SV = PV(0xd9ae08) at 0x25809b0
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK)
  PV = 0x260a4e8 "\303\245\303\244\303\266"\0
  CUR = 6
  LEN = 10
  COW_REFCNT = 1
SV = PV(0xd9add8) at 0x2580248
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK,UTF8)
  PV = 0x260a068 "\303\245\303\244\303\266"\0 [UTF8 "\x{e5}\x{e4}\x{f6}"]
  CUR = 6
  LEN = 10
  COW_REFCNT = 1
SV = PV(0xd9afe8) at 0x2580a40
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK,UTF8)
  PV = 0x2767378 "\303\245\n"\0 [UTF8 "\x{e5}\n"]
  CUR = 3
  LEN = 10
  COW_REFCNT = 1
å
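
The same difference can be checked without Devel::Peek. The following is only a sketch, not part of the original test: it assumes the script file itself is saved as UTF-8, and it limits each pragma to its own block so both variants can be compared in one run. utf8::is_utf8 reports the internal flag that Dump lists under FLAGS.

```perl
use strict;
use warnings;

my ($t1, $t2);
{
    no utf8;          # the default: the literal is kept as raw source bytes
    $t1 = 'åäö';
}
{
    use utf8;         # the literal is decoded from UTF-8 into characters
    $t2 = 'åäö';
}

# length() counts bytes for $t1 and characters for $t2.
printf "t1: length=%d UTF8=%s\n", length($t1), utf8::is_utf8($t1) ? 'on' : 'off';
printf "t2: length=%d UTF8=%s\n", length($t2), utf8::is_utf8($t2) ? 'on' : 'off';
```

With a UTF-8 encoded source file this should report length 6 with the flag off for $t1, and length 3 with the flag on for $t2.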

UPDATE2

Extended the code with $t3, which for me prints å rather than an empty line.

UPDATE3

Of course, how the printed string is displayed also depends on the output channel and the display settings.
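
To make the dependence on the output channel concrete, here is a small sketch (not taken from the code above) that prints the same one-character string to two in-memory handles, one without any layer and one with :encoding(UTF-8), and then shows the bytes that arrived. The handle and variable names are made up for illustration.

```perl
use strict;
use warnings;

my $char = "\N{U+00E5}";    # one character: LATIN SMALL LETTER A WITH RING ABOVE

# Two in-memory "channels": one raw, one with a UTF-8 encoding layer.
open my $raw_fh,  '>',                 \my $raw_out  or die "open failed: $!";
open my $utf8_fh, '>:encoding(UTF-8)', \my $utf8_out or die "open failed: $!";

print {$raw_fh}  $char;
print {$utf8_fh} $char;
close $raw_fh;
close $utf8_fh;

# Show the bytes each channel received, as hex.
print "no layer:    ", unpack('H*', $raw_out),  "\n";
print "UTF-8 layer: ", unpack('H*', $utf8_out), "\n";
```

Without a layer the character goes out as the single byte e5, with the layer as the two bytes c3a5. Which of the two a terminal renders as å then depends on its own encoding setting.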

Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery

Re^5: UTF8 Output with XML::Feed? (updated)
by pryrt (Abbot) on Mar 07, 2022 at 20:28 UTC
    > erm ... åäö???

    I had the same thought on my first reading of that post, and even sent a /msg to that effect.

    But then I reread kcott's post, and saw that the second block of code from the earlier post was the code kcott focused on, and was presumably the code that kcott said didn't need use utf8; -- which seems right, because it doesn't contain non-ASCII characters.

      yes I realized it in the meantime.

      I didn't expect it, but \N{} automatically turns on the UTF8 flag for the surrounding string (which makes sense in hindsight).

      use v5.12;
      use warnings;
      use Devel::Peek;
      #use open OUT => qw{:encoding(UTF-8) :std};
      my $t4 = "\N{LATIN SMALL LETTER A WITH RING ABOVE}\n";
      Dump $t4;
      warn "t4: ",$t4;
      __DATA__
      OUTPUT:
      SV = PV(0xe7ae08) at 0xd308f0
        REFCNT = 1
        FLAGS = (POK,IsCOW,pPOK,UTF8)
        PV = 0x2742188 "\303\245\n"\0 [UTF8 "\x{e5}\n"]
        CUR = 3
        LEN = 10
        COW_REFCNT = 1
      t4: å
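
      For comparison, a minimal sketch (an addition, not from the run above) that puts the same character into a string once via \xE5 and once via \N{U+00E5}. As far as I know the two strings compare equal, and only the internal flag differs. The variable names are hypothetical.

      ```perl
      use strict;
      use warnings;

      my $hex   = "\xE5";         # 8-bit escape: stays a byte string internally
      my $named = "\N{U+00E5}";   # \N{} escape: stored with the UTF8 flag on

      # Same single character either way, so the strings compare equal.
      print "equal: ", ($hex eq $named ? 'yes' : 'no'), "\n";

      # Only the internal representation differs.
      printf "\\xE5 flag: %s\n", utf8::is_utf8($hex)   ? 'on' : 'off';
      printf "\\N{} flag: %s\n", utf8::is_utf8($named) ? 'on' : 'off';
      ```

      So the flag says something about how Perl stores the string, not about which characters it contains.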

      Cheers Rolf

Re^5: UTF8 Output with XML::Feed? (updated)
by kcott (Archbishop) on Mar 08, 2022 at 01:44 UTC

    G'day Rolf,

    I wrote my post (to which you replied) very early this morning ... then there was $work ... now it's my lunchtime and I see your original question has been resolved (with some input from pryrt). So, there's probably little more for me to say about that specifically.

    Thanks for all of the extra work you did: Devel::Peek and so on.

    By the way, I got a huge laugh from your latest user image. Thanks for that also.

    — Ken