I got this as output from my program.foreach $i (@row) { print(getcode($i), "\n"); $i++; }
Ah. Sorry, I should have pointed out earlier that there is a problem with that loop. You need to study Perl syntax a little more...
When you say for $i ( @row ) (or "foreach"), $i is being set to each successive value of @row on each iteration -- in other words, $i is not an array index, it is the value stored at each element of the array. So do not increment $i in that sort of situation, because it makes no sense to do that. (That's probably where the "1" is coming from.)
So on the first iteration through that loop, you are looking at the English field, which is presumably ascii data. You still need to figure out what encoding is being used in the latter two fields (Kana and Kanji). I gather that the "getcode" method in Jcode is supposed to return the encoding -- here's what the documentation says:
($code, $nmatch) = getcode($str)
Returns char code of $str. Return codes are as follows
ascii Ascii (Contains no Japanese Code)
binary Binary (Not Text File)
euc EUC-JP
sjis SHIFT_JIS
jis JIS (ISO-2022-JP)
ucs2 UCS2 (Raw Unicode)
utf8 UTF8
So this method should tell you what you need to know. I'll try again with a snippet suggestion:
binmode STDOUT, ":utf8"; # connect and run your query on Access db... then: my @row = $sth->fetchrow_array; my $eng = shift @row; # first field is English my $kana = shift @row; # second field is Kana my $kanji = shift @row; # third field is Kanji my $kana_enc = getcode( $kana ); my $kanji_enc = getcode( $kanji ); if ( $kana_encoding ne $kanji_encoding ) { warn "Very strange: kana is in $kana_enc, but kanji is in $kanji_e +nc\n"; } my $kana_utf8 = decode( $kana_enc, $kana ); my $kanju_utf8 = decode( $kanji_enc, $kanji ); printf( "English: %s Kana: %s Kanji: %s\n", $eng, $kana_utf8, $kanji +_utf8 );
You just said "this doesn't work"... You have to be more explicit. Show the actual code you used, including the modifications you made according to my suggestions (so I can see whether you actually did as I intended), and give some sort of definition for "doesn't work", in the sense of "I expected this: ... but got this: ..." -- that is, try to show some actual data.
(Saving the output to a file and viewing that with any sort of tool that shows byte-by-byte hex codes can be very helpful. On unix/linux and unix-tools-ported-to-windows, there's the "od" command, and just running "od -txC data.file" would do nicely.)
Please, a little more information about what you are dealing with, and what you've done with my earlier suggestion.
UPDATE: I just noticed that the strings returned by Jcode::getcode() might not work when passed to Encode::decode. You may need to add a hash that maps the Jcode strings to valid Encode designations:
my %code_map = ( euc => 'euc-jp', sjis => 'shiftjis', jis => 'iso-2022-jp', ucs2 => 'UCS-2LE', utf8 => 'utf8' ); # ... my $kana_enc = getcode( $kana ); # ... $kana_utf8 = decode( $code_map{$kana_enc}, $kana ); # ...
In reply to Re^3: MS Access Input -> Japanese Output
by graff
in thread MS Access Input -> Japanese Output
by Zettai
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |