I'm working on a programming project where the code is documented in Japanese, encoded as Shift-JIS (S-JIS). Since most of us here can't read Japanese, we've been trying a lot of things to get it translated. My IDE is CodeWright, which conveniently includes a Perl interpreter for macros. My hope is to be able to highlight a Japanese string and translate it on the fly. Eventually I want to write a macro for this, but for now I'm writing a standalone script:
#!perl -w
use strict;
use Jcode;
use WWW::Babelfish;

my $DEBUG = 1;
my $text  = '';

# Read the Japanese source line by line and convert each line to UTF-8.
while ( my $line = <> ) {
    chomp $line;
    if ($DEBUG) {
        # getcode() reports Jcode's guess at the input encoding.
        my ($code) = getcode($line);
        print "Chunk encoded as: " . $code . "\n";
    }
    my $j = Jcode->new($line);
    $text .= $j->utf8 . "\n";
}

print "\nText to send:\n" . $text . "\n" if $DEBUG;

print "\nConnecting to translator... please wait.\n\n";
my $obj = WWW::Babelfish->new();
die("Babelfish server unavailable\n") unless defined($obj);

print "\nTranslating... this may take a loooong time.\n\n";
my $english = $obj->translate(
    source      => 'Japanese',
    destination => 'English',
    text        => $text,
    delimiter   => "\n",    # double quotes: '\n' would be a literal backslash and "n"
);
die( "Could not translate: " . $obj->error . "\n" ) unless defined($english);

print "\nTranslation: \n\n";
print $english;
print "\n";
If it were an ASCII-friendly language like French or German, I wouldn't have any trouble. But since it's Japanese, I figured I'd have to convert the text to UTF-8 for Babelfish. I used Jcode to do this, but I'm not sure WWW::Babelfish is robust enough to handle multi-byte encodings. Any pointers would be appreciated.
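To check the encoding step on its own, separately from Babelfish, here is a minimal sketch of the Shift-JIS to UTF-8 conversion using the core Encode module instead of Jcode. This is an assumption on my part, not what the script above does: `sjis_to_utf8` is a hypothetical helper name, and it says nothing about whether WWW::Babelfish handles the resulting bytes correctly.

```perl
#!perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper (not part of Jcode or WWW::Babelfish):
# convert a Shift-JIS byte string into a UTF-8 byte string.
sub sjis_to_utf8 {
    my ($bytes) = @_;

    # FB_CROAK makes decode() die on malformed input instead of
    # silently substituting replacement characters.
    my $chars = decode( 'shiftjis', $bytes, Encode::FB_CROAK );

    return encode( 'UTF-8', $chars );
}

# The three characters of "Nihongo" (the word for the Japanese
# language), written out as raw Shift-JIS bytes.
my $sjis = "\x93\xFA\x96\x7B\x8C\xEA";
my $utf8 = sjis_to_utf8($sjis);

# Prints: Shift-JIS: 6 bytes, UTF-8: 9 bytes
printf "Shift-JIS: %d bytes, UTF-8: %d bytes\n", length($sjis), length($utf8);
```

If the byte counts come out as expected here, the conversion is probably not the problem, and the next thing to look at would be how the UTF-8 text is sent to and returned from the translation service.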

--isotope
http://www.skylab.org/~isotope/

In reply to Translating Japanese to English with WWW::Babelfish by isotope
