GaijinPunch has asked for the wisdom of the Perl Monks concerning the following question:

Okay, having serious issues here. I've tried the following:

use Unicode::Japanese;
my $word = Unicode::Japanese->new($word, 'euc-jp')->get;   # mangled

And this:

use Encode qw/encode decode/;
$word = from_to( "euc-jp", $word );   # from_to subroutine doesn't exist
$word = encode( "euc-jp", $word );    # mangled


Seems the output isn't encoded in any valid encoding. Starting to run out of options here.

Re: Converting from UTF8->EUC-JP
by lestrrat (Deacon) on Jul 04, 2005 at 02:44 UTC

    from_to is not exported by default. It also converts the string *in place*, so you don't assign its return value back to the variable.

    use Encode qw(from_to);
    from_to($word, 'utf8', 'euc-jp');

    As for your second attempt, it depends. The most likely cause is that $word hasn't been decoded into Perl's internal string representation, so encode() is being handed raw UTF-8 bytes instead of characters. You'd need to show a more specific example.
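
    A minimal sketch of the difference, assuming the input is the raw UTF-8 bytes for a single character (U+304B, chosen here only for illustration). It shows from_to working in place, and why calling encode() on undecoded bytes gives a different result from decoding first:

```perl
use strict;
use warnings;
use Encode qw(from_to encode decode);

# Raw UTF-8 octets for HIRAGANA LETTER KA (U+304B); an illustrative input.
my $bytes = "\xe3\x81\x8b";

# from_to() rewrites its first argument in place and returns the length
# of the converted data in bytes, not the converted string itself.
my $word = $bytes;
my $len  = from_to($word, 'utf8', 'euc-jp');
printf "from_to:             %s (%d bytes)\n", unpack('H*', $word), $len;

# encode() expects a decoded character string. Given raw UTF-8 octets it
# treats each octet as its own character, so the output is not the
# EUC-JP form of the original text.
my $mangled = encode('euc-jp', $bytes);
printf "encode on raw bytes: %s\n", unpack('H*', $mangled);

# Decoding first hands encode() real characters.
my $ok = encode('euc-jp', decode('utf8', $bytes));
printf "decode then encode:  %s\n", unpack('H*', $ok);
```

    The first and third lines should both show a4ab, the EUC-JP bytes for that character; the second will not.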

      Very nice -- that did it. Much appreciated!
Re: Converting from UTF8->EUC-JP
by Anonymous Monk on Jul 04, 2005 at 02:48 UTC
    I'm just barely starting to understand encodings and how Perl handles them, so pardon the mess. I'm not sure why, but this worked for me:
    $ perl -e 'print "\x30\x4b","\x00\x0a"' | iconv -f utf-16be -t utf-8 |
        perl -MEncode=encode,decode -e '$x=<>;print encode("euc-jp",decode("utf8",$x))' | xxd
    0000000: a4ab 0a ...
    $ gzcat /usr/share/i18n/charmaps/EUC-JP.gz | fgrep U304B
    <U304B> /xa4/xab HIRAGANA LETTER KA
    It looks like you need to tell Encode what encoding your text is currently in before it can convert it.
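
    One other way to declare the encodings, sketched here as an assumption rather than anything from the thread above, is to put them on the filehandles with PerlIO layers. In-memory filehandles stand in for real files so the example is self-contained:

```perl
use strict;
use warnings;

# UTF-8 octets for U+304B plus a newline, standing in for a real input file.
my $utf8_input = "\xe3\x81\x8b\n";
my $euc_output = '';

# The :encoding() layers decode on read and encode on write, so the
# loop body only ever sees character strings.
open my $in,  '<:encoding(UTF-8)',  \$utf8_input or die "open input: $!";
open my $out, '>:encoding(euc-jp)', \$euc_output or die "open output: $!";

while (my $line = <$in>) {
    print {$out} $line;
}

close $in;
close $out or die "close output: $!";   # flushes the encoded bytes

# Hex dump of the converted data.
print unpack('H*', $euc_output), "\n";
```

    With real files, the same layers would go on the open calls for the input and output paths, or on STDIN and STDOUT via binmode.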