in reply to UTF8 Output with XML::Feed?

G'day mldvx4,

Note: I've used this common alias of mine in a couple of places:

$ alias perlu
alias perlu='perl -Mstrict -Mwarnings -Mautodie=:all -Mutf8 -C -E'

There are two lines in your output that you should note. When I run your code as posted, I get:

...
<description>Feed from a to &#xC3;&#xB6;</description>
...
<title>abc...&#xC3;&#xA5;&#xC3;&#xA4;&#xC3;&#xB6;</title>
...

When I add use utf8;, I get:

...
<description>Feed from a to &#xF6;</description>
...
<title>abc...&#xC3;&#xA5;&#xC3;&#xA4;&#xC3;&#xB6;</title>
...

So, that's fixed the $feed->description():

$ perlu 'say chr hex "F6"'
ö

but not the $entry->title().
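To illustrate why the first output shows &#xC3;&#xB6; rather than &#xF6;, here is a minimal sketch that does not use XML::Feed at all. It assumes your source file is saved as UTF-8, so that without use utf8; Perl sees the literal "ö" as two separate bytes. The hex_points() helper is a name I've made up for this example; Encode is a core module.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: list the code points of a string as uppercase hex.
sub hex_points {
    my ($string) = @_;
    return join ' ', map { sprintf '%X', ord($_) } split //, $string;
}

# What a UTF-8 source literal "ö" looks like WITHOUT "use utf8":
# two one-byte characters, 0xC3 and 0xB6.
my $bytes = "\xC3\xB6";

# What the same literal looks like WITH "use utf8":
# one character, U+00F6. Decoding the bytes gives the same result.
my $chars = decode('UTF-8', $bytes);

print hex_points($bytes), "\n";    # C3 B6
print hex_points($chars), "\n";    # F6
```

The first line of output matches the pair of numeric character references in the original <description>; the second matches what appears after adding use utf8;.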

Look at the difference between how you code XML::Feed->new($format) and XML::Feed::Entry->new($format). Aligning those by changing

my $entry = XML::Feed::Entry->new();

to

my $entry = XML::Feed::Entry->new('RSS');

I now get:

... <description>Feed from a to &#xF6;</description> ... <title>abc...&#xE5;&#xE4;&#xF6;</title> ...

So, both the $feed->description() and $entry->title() are now fixed:

$ perlu 'say chr hex for qw{E5 E4 F6}'
å
ä
ö

I'll also draw your attention to "XML::Feed: Atom feeds come out as bytes, but RSS as Unicode [rt.cpan.org #43004] #44". I haven't looked into this but it might have some relevance in relation to other XML::Feed work you may be doing.

— Ken

Re^2: UTF8 Output with XML::Feed?
by ikegami (Patriarch) on Mar 07, 2022 at 18:27 UTC
    my $entry = XML::Feed::Entry->new();
    is equivalent to
    my $entry = XML::Feed::Entry->new('Atom');

    So this appears to be a bug on the Atom side of things.

    And the ticket to which you linked supports that.