I got this as output from my program.
foreach $i (@row) { print(getcode($i), "\n"); $i++; }

Ah. Sorry, I should have pointed out earlier that there is a problem with that loop. You need to study Perl syntax a little more...

When you say for $i ( @row ) (or "foreach"), $i is being set to each successive value of @row on each iteration -- in other words, $i is not an array index, it is the value stored at each element of the array. So do not increment $i in that sort of situation, because it makes no sense to do that. (That's probably where the "1" is coming from.)
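To make that concrete, here is a minimal sketch of the difference between looping over values and looping over indexes. The contents of @row and @nums are made-up stand-ins, not your data. It also shows a side effect worth knowing about: the loop variable is an alias for the array element, so incrementing it changes the array itself.

```perl
use strict;
use warnings;

# Hypothetical stand-in for one fetched row (not real data).
my @row = ( 'book', 'hon', 'HON' );

# foreach hands you each value in turn, not a position.
my @seen;
foreach my $field (@row) {
    push @seen, $field;
}

# If you really want positions, loop over the indexes instead.
my @indexed;
for my $idx ( 0 .. $#row ) {
    push @indexed, "$idx=$row[$idx]";
}

# The loop variable is an alias, so ++ modifies the array in place.
my @nums = ( 1, 2, 3 );
foreach my $n (@nums) {
    $n++;
}

print join( ',', @seen ),    "\n";   # book,hon,HON
print join( ',', @indexed ), "\n";   # 0=book,1=hon,2=HON
print join( ',', @nums ),    "\n";   # 2,3,4
```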

So on the first iteration through that loop, you are looking at the English field, which is presumably ASCII data. You still need to figure out what encoding is being used in the latter two fields (Kana and Kanji). I gather that the "getcode" function in Jcode is supposed to return the encoding -- here's what the documentation says:

       ($code, $nmatch) = getcode($str)
         Returns char code of $str. Return codes are as follows

          ascii   Ascii (Contains no Japanese Code)
          binary  Binary (Not Text File)
          euc     EUC-JP
          sjis    SHIFT_JIS
          jis     JIS (ISO-2022-JP)
          ucs2    UCS2 (Raw Unicode)
          utf8    UTF8
So this method should tell you what you need to know. I'll try again with a snippet suggestion:

    use Jcode;    # exports getcode()
    use Encode;   # exports decode()

    binmode STDOUT, ":utf8";

    # connect and run your query on Access db... then:

    my @row = $sth->fetchrow_array;
    my $eng   = shift @row;  # first field is English
    my $kana  = shift @row;  # second field is Kana
    my $kanji = shift @row;  # third field is Kanji

    # scalar context, so getcode returns just the code name
    my $kana_enc  = getcode( $kana );
    my $kanji_enc = getcode( $kanji );
    if ( $kana_enc ne $kanji_enc ) {
        warn "Very strange: kana is in $kana_enc, but kanji is in $kanji_enc\n";
    }
    my $kana_utf8  = decode( $kana_enc, $kana );
    my $kanji_utf8 = decode( $kanji_enc, $kanji );
    printf( "English: %s Kana: %s Kanji: %s\n", $eng, $kana_utf8, $kanji_utf8 );

You just said "this doesn't work"... You have to be more explicit. Show the actual code you used, including the modifications you made according to my suggestions (so I can see whether you actually did as I intended), and give some sort of definition for "doesn't work", in the sense of "I expected this: ... but got this: ..." -- that is, try to show some actual data.

(Saving the output to a file and viewing that with any sort of tool that shows byte-by-byte hex codes can be very helpful. On unix/linux and unix-tools-ported-to-windows, there's the "od" command, and just running "od -txC data.file" would do nicely.)
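If "od" isn't handy, a few lines of Perl will do the same job. This is only a sketch: hex_bytes is a name I made up, and the test character (hiragana "a", U+3042) is a stand-in for one of your fields. It prints the bytes that character turns into under three encodings, which gives you something to compare your own dump against.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: return the bytes of a string as
# space-separated two-digit hex values.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', unpack '(H2)*', $bytes;
}

my $char = "\x{3042}";    # HIRAGANA LETTER A, used as sample data

for my $enc ( 'euc-jp', 'shiftjis', 'UTF-8' ) {
    printf "%-8s %s\n", $enc, hex_bytes( encode( $enc, $char ) );
}
# euc-jp   a4 a2
# shiftjis 82 a0
# UTF-8    e3 81 82
```

Run hex_bytes on the raw $kana and $kanji values straight out of fetchrow_array, before any decoding, and post what you get.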

Please, a little more information about what you are dealing with, and what you've done with my earlier suggestion.

UPDATE: I just noticed that the strings returned by Jcode::getcode() might not work when passed to Encode::decode. You may need to add a hash that maps the Jcode strings to valid Encode designations:

    my %code_map = (
        euc  => 'euc-jp',
        sjis => 'shiftjis',
        jis  => 'iso-2022-jp',
        ucs2 => 'UCS-2LE',
        utf8 => 'utf8',
    );
    # ...
    my $kana_enc = getcode( $kana );
    # ...
    $kana_utf8 = decode( $code_map{$kana_enc}, $kana );
    # ...
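Here is a sketch of how that lookup might be wrapped up so that an unmapped result (say, "binary") fails loudly instead of handing undef to decode. The helper name decode_detected is made up, the "ascii" entry is my addition to the map, and the Shift_JIS bytes are a hard-coded stand-in for a database field, so Jcode isn't needed to try it.

```perl
use strict;
use warnings;
use Encode qw(decode find_encoding);

my %code_map = (
    ascii => 'ascii',          # assumption: added here, not in the map above
    euc   => 'euc-jp',
    sjis  => 'shiftjis',
    jis   => 'iso-2022-jp',
    ucs2  => 'UCS-2LE',
    utf8  => 'utf8',
);

# Hypothetical helper: translate a Jcode-style code name into an
# Encode name, check that Encode knows it, then decode the bytes.
sub decode_detected {
    my ( $jcode_name, $bytes ) = @_;
    my $enc_name = $code_map{$jcode_name}
        or die "No Encode name known for Jcode result '$jcode_name'\n";
    find_encoding($enc_name)
        or die "Encode does not recognize '$enc_name'\n";
    return decode( $enc_name, $bytes );
}

# Stand-in for a field value: hiragana "a" in Shift_JIS.
my $sjis_bytes = "\x82\xa0";
my $text = decode_detected( 'sjis', $sjis_bytes );
printf "U+%04X\n", ord $text;    # U+3042
```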

In reply to Re^3: MS Access Input -> Japanese Output by graff
in thread MS Access Input -> Japanese Output by Zettai
