in reply to Re: LWP gives funky characters
in thread LWP gives funky characters

require LWP::UserAgent;
my $ua = LWP::UserAgent->new;
my $response = $ua->get($ARGV[0]);
$_ = $response->content;

Then I take the resulting page, find the DIV of interest, and extract the text. That text goes into the database. Later, another page selects the text and puts it in a textarea.

The page is coming in as text/html; charset=UTF-8, and my destination page is also text/html; charset=UTF-8.

I'll look at Encode / decode('UTF-8'... that might be exactly what I need.
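For what it's worth, here is a minimal sketch of the Encode route, assuming the body really is UTF-8. The `$bytes` string is a made-up stand-in for what `$response->content` returns (raw bytes, not characters). LWP's `$response->decoded_content` does a similar decode based on the response's Content-Type header.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical sample data standing in for $response->content:
# the five UTF-8 bytes of "café".
my $bytes = "caf\xC3\xA9";

# Decode the bytes into Perl characters before matching or extracting text.
my $text = decode('UTF-8', $bytes);

# Encode exactly once when writing the text back out (database, HTTP body, ...).
my $out = encode('UTF-8', $text);

printf "bytes in: %d, characters: %d, bytes out: %d\n",
    length($bytes), length($text), length($out);
```

The point is that decoding happens once on the way in and encoding once on the way out. Any layer that encodes again on top of that produces the funky characters.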

Thanks.

Re^3: LWP gives funky characters
by ikegami (Patriarch) on Jan 25, 2007 at 01:45 UTC

    Then it's the first possibility. You're missing:

    print $cgi->header(-type=>'text/html', -charset=>'UTF-8');

    You probably have

    print $cgi->header(-type=>'text/html');

    which is the same as

    print $cgi->header(-type=>'text/html', -charset=>'ISO-8859-1');

    That tells the browser you are using one character set when you are using another.
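To illustrate the mismatch described above, here is a small sketch that needs no browser. It takes the UTF-8 bytes for "é" and decodes them both ways, which is roughly what a browser does depending on the charset it is told. The sample character and the `codepoints` helper are assumptions made for the illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: list the code points of a character string.
sub codepoints {
    my ($string) = @_;
    return join ' ', map { sprintf('U+%04X', ord($_)) } split //, $string;
}

# UTF-8 bytes for "é" (U+00E9), as the script would send them.
my $sent = encode('UTF-8', "\x{E9}");

# Header says ISO-8859-1: every byte becomes its own character.
my $as_latin1 = decode('ISO-8859-1', $sent);

# Header says UTF-8: the single character that was intended.
my $as_utf8 = decode('UTF-8', $sent);

print "as ISO-8859-1: ", codepoints($as_latin1), "\n";   # U+00C3 U+00A9
print "as UTF-8:      ", codepoints($as_utf8), "\n";     # U+00E9
```

The two-character result under ISO-8859-1 is the classic "Ã©" pattern, so seeing it in a page is a good hint that the declared charset and the actual bytes disagree.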

      Hmmm... I'm pretty stuck.

      Looking at it, I think Perl is probably doing the right thing and the UTF-8 string is getting double-encoded somewhere else. Thanks, though. That's what I get for going perl -> sql -> aspx -> javascript -> dhtml... I'm hoping I can fix it coming out of aspx or in the javascript.
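If double encoding is the culprit, a sketch like the following shows what it looks like at the byte level and how it can be undone. It assumes the extra layer treated the UTF-8 bytes as ISO-8859-1 before encoding again. A Windows stack may well use Windows-1252 instead, in which case the repair step would need that encoding name. The `hex_bytes` helper and the sample character are made up for the illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: show a byte string as hex pairs.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf('%02X', ord($_)) } split //, $bytes;
}

my $text = "\x{E9}";                  # the character "é"
my $once = encode('UTF-8', $text);    # correct single encoding

# A later layer wrongly reads those bytes as ISO-8859-1 text and encodes again.
my $twice = encode('UTF-8', decode('ISO-8859-1', $once));

# Repair by undoing the steps in reverse order. This is only valid if the
# assumed sequence is really what happened to the data.
my $repaired = encode('ISO-8859-1', decode('UTF-8', $twice));

print "once:     ", hex_bytes($once), "\n";       # C3 A9
print "twice:    ", hex_bytes($twice), "\n";      # C3 83 C2 A9
print "repaired: ", hex_bytes($repaired), "\n";   # C3 A9
```

Dumping the bytes stored in the database this way should show which hop in the chain adds the second encoding: if the stored bytes already look like the "twice" line, the problem is upstream of aspx.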