in reply to XML::RSS

Sounds to me like the RSS is incorrectly claiming that it is UTF-8 when in fact it is Latin-1. That's a common problem when people are writing RSS with hand-rolled ad-hoc tools instead of formal DOM tools.
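
One way to check that diagnosis is to attempt a strict UTF-8 decode of the raw feed bytes. The sketch below is only an illustration: `is_valid_utf8` is a hypothetical helper name, and the sample strings stand in for real feed content. It uses the core Encode module.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Hypothetical helper: true if the byte string is well-formed UTF-8.
# FB_CROAK makes decode() die on malformed input; LEAVE_SRC keeps
# decode() from modifying the string it is given.
sub is_valid_utf8 {
    my ($bytes) = @_;
    return eval { decode('UTF-8', $bytes, FB_CROAK | LEAVE_SRC); 1 } ? 1 : 0;
}

my $latin1 = "caf\xE9";        # "cafe" with e-acute, as Latin-1 bytes
my $utf8   = "caf\xC3\xA9";    # the same word as UTF-8 bytes

for my $sample ($latin1, $utf8) {
    print is_valid_utf8($sample) ? "valid UTF-8\n" : "not valid UTF-8\n";
}
```

A feed that declares UTF-8 (or declares nothing) but fails this check is probably in some 8-bit encoding such as Latin-1. The reverse does not hold: pure ASCII passes the check under either encoding.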

-- Randal L. Schwartz, Perl hacker
Be sure to read my standard disclaimer if this is a reply.

Replies are listed 'Best First'.
Re: •Re: XML::RSS
by alexg (Beadle) on Mar 21, 2003 at 12:12 UTC
    Ok, I've done some more reading - let me see if I've got this right:

    LWP::Simple returns HTML which is encoded in ISO-8859-1 (Latin-1).
    XML::Parser defaults to UTF-8 (Unicode) if the XML does not specify an encoding.

    So using:
    my $rss = XML::RSS->new( encoding => 'ISO-8859-1' );

    should work.

    It doesn't.

      The encoding argument is only used when you create an RSS stream with the library. It has no effect on the encoding of an input stream that you parse.

      --rjray
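
Since the constructor argument does not help on input, one possible workaround is to re-encode the fetched bytes yourself before parsing. This is a minimal sketch, assuming the feed really is Latin-1 and that its XML declaration either omits the encoding or claims UTF-8. `latin1_to_utf8` is a hypothetical helper name and the sample feed is made up.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: reinterpret Latin-1 bytes as characters,
# then write them back out as UTF-8 bytes.
# Assumption: the input really is ISO-8859-1.
sub latin1_to_utf8 {
    my ($bytes) = @_;
    return encode('UTF-8', decode('ISO-8859-1', $bytes));
}

# Made-up feed fragment containing one Latin-1 byte (e-acute, 0xE9).
my $feed  = qq{<?xml version="1.0"?>\n<title>caf\xE9</title>\n};
my $fixed = latin1_to_utf8($feed);

printf "before: %d bytes, after: %d bytes\n", length($feed), length($fixed);
```

The converted string can then be handed to the parser, for example `$rss->parse($fixed)`. If the declaration explicitly says `encoding="ISO-8859-1"`, leave the bytes alone, because converting them would create a new mismatch.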