in reply to Generating UTF-8 from nasty high ASCII input

If you really need to convert to UTF-8 (and I would try IlyaM's suggestion first), you might be interested in the Encode module. It ships with the 5.8.0 distribution.

From the manpage:

    $string = decode(ENCODING, $octets [, CHECK])

    Decodes a sequence of octets assumed to be in ENCODING into Perl's internal form and returns the resulting string. As in encode(), ENCODING can be either a canonical name or an alias. For encoding names and aliases, see "Defining Aliases". For CHECK, see "Handling Malformed Data".
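To make the signature above concrete, here is a minimal sketch of a decode-then-encode round trip. It assumes the "high ASCII" input is ISO-8859-1, which the thread does not actually state; the sample bytes and variable names are made up for illustration, so swap in whatever encoding your data really uses.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical input: "café" as ISO-8859-1 bytes.
# The source encoding is an assumption, not something the original post names.
my $octets = "caf\xE9";

# decode() interprets the octets as ISO-8859-1 and returns a character string
# in Perl's internal form.
my $string = decode('iso-8859-1', $octets);

# encode() serializes that character string as UTF-8 octets.
my $utf8 = encode('UTF-8', $string);

# prints: 4 characters, 5 UTF-8 bytes
printf "%d characters, %d UTF-8 bytes\n", length($string), length($utf8);
```

The optional CHECK argument is left out here, so malformed input falls back to the module's default handling; see "Handling Malformed Data" in the manpage if you need it to croak instead.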

I've only played with this a little, but 5.8 (5.8.0 RC2, that is) seems to be a lot more stable when you use utf8; so it might be your best bet.

-- Joost
downtime n. The period during which a system is error-free and immune from user input.

Re: Re: Generating UTF-8 from nasty high ASCII input
by samtregar (Abbot) on Jul 10, 2002 at 16:32 UTC
    Do you know of any reason that this would work where Unicode::Map8 and umap didn't? My impression is that they perform the same task.

    -sam