in reply to LWP::Simple // Special Character problems.

The target web-page contains the text: "æ, ø, å"
That indicates that the web-page in question did not properly encode unsafe characters using HTML::Entities or equivalent, right? But that's probably not your fault, unless of course it's your own site. But you can decode it yourself:
use strict;
use warnings;
use LWP::Simple;
use HTML::Entities;

my $str = decode_entities(get(q{http://www.uio.no}));
my @arr = split('\s+', $str);
for (@arr) {
    print if (m/[æøå]/i);
}
__END__
største ønsker å søk--> <!--Søk søkeknapp alt="Søk" value="Søk" Walløe forskingsråd</a><span å nivået ...
Update: Please disregard this post. It is wrong and misleading.

Andreas
--

Re^2: LWP::Simple // Special Character problems.
by ikegami (Patriarch) on May 24, 2007 at 17:32 UTC
    There's nothing unsafe about those characters. "A" is just as safe/unsafe. Your code *happens* to work in this specific case, but will not work for all encodings. It'll fail for UTF-16, for example.
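A minimal sketch of the encoding point, not taken from the thread: the bytes are decoded with an explicitly named charset before the regex is applied, so the same match works whether the text arrives as UTF-8 or UTF-16. The helper name matching_words and the sample string are made up for illustration, and the "page" is simulated with Encode rather than fetched over the network.

```perl
use strict;
use warnings;
use utf8;                       # this source file contains literal æ, ø, å
use Encode qw(encode decode);

binmode STDOUT, ':encoding(UTF-8)';

# Hypothetical helper: decode raw bytes with the given charset, then
# return the whitespace-separated words containing æ, ø or å.
sub matching_words {
    my ($bytes, $charset) = @_;
    my $chars = decode($charset, $bytes);
    return grep { /[æøå]/i } split /\s+/, $chars;
}

my $text = "største ønsker å søk";

# Simulate the same page served in two different encodings.
for my $charset ('UTF-8', 'UTF-16LE') {
    my $bytes = encode($charset, $text);
    my @words = matching_words($bytes, $charset);
    printf "%-8s %2d bytes, matched: %s\n", $charset, length($bytes), "@words";
}
```

For a real fetch, the charset would normally come from the response itself. As far as I know, the decoded_content method on the HTTP::Response object returned by LWP::UserAgent does this kind of charset-aware decoding, whereas LWP::Simple's get hands back the body without telling you which charset was declared.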
      Hi. Thanks to you all for great feedback. I've gotten this working now. Cheers, Fro