Akira71 has asked for the wisdom of the Perl Monks concerning the following question:
This is getting serious. As you know, I posted a question here a few days ago about using Perl 5.6, Oracle 8i and Japanese scripts. I got some excellent suggestions and followed through, and now I am having what seems to be an even more serious issue. I imagine this comes down to base encoding problems and might be a little off-topic for PerlMonks, but everyone here has been extremely helpful and knowledgeable, so here goes.
It started out with an app written in Java and run from a browser. It connects to an Oracle 8i database whose character set is UTF8. The web browsers (tested in Netscape, Opera and IE) are all set to Shift-JIS, as is the encoding in the HTML and JSP pages. I am assuming we are writing raw Shift-JIS data straight into the UTF-8 database. When we use the browser-based application to read from that database and output to the screen, we get correct Japanese everywhere.
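To make that assumption concrete, here is a minimal sketch of one way Shift-JIS bytes could survive a round trip through a UTF8 database. It is written in Python purely for illustration and is not taken from the application. It assumes, without confirmation, that the Java client's connection declares a single-byte character set (Latin-1 stands in for it here), so the database converts each Shift-JIS byte to its own character on the way in and back to the same byte on the way out. The sample string and variable names are hypothetical.

```python
# Hypothetical illustration: Shift-JIS bytes passed through a connection
# that is assumed to treat every byte as one Latin-1 character.
text = "日本語"                         # sample Japanese string
sjis_bytes = text.encode("shift_jis")   # what the browser form would submit

# Write path (assumed): each byte becomes one character, stored as UTF-8.
stored = sjis_bytes.decode("latin-1").encode("utf-8")

# Read path through the same client setting reverses the conversion.
read_back = stored.decode("utf-8").encode("latin-1")

print(stored == sjis_bytes)                   # False: stored bytes are not Shift-JIS
print(read_back == sjis_bytes)                # True: original bytes are restored
print(read_back.decode("shift_jis") == text)  # True: decodes to the same Japanese
```

If something like this is happening, a client that reads with a different character set setting would receive the stored bytes rather than the original Shift-JIS bytes, and those would look like garbage.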
Here is the real problem. When we do the same with Perl 5.6, we get garbage everywhere. I was able to use the suggestions from the monks in my last thread to set all my environment variables correctly and ensure everything was working. As a matter of fact, if I put the output of the Perl program into an ANSI text file and add HTML HEAD and BODY tags, then the output is nearly perfect Japanese (I say nearly because some characters are dropped; that is another issue).
However, even though I can output raw UTF-8, I cannot make sense of any of the data unless it is output to a browser. I am using TOAD and various other tools to inspect the data in the database, and as suspected (even on a Japanese O/S) the data appears to be garbage ASCII characters. This is fine. The output from Perl is the exact same set of garbage characters, but of course with the bytes intact. What I need to know is how we can make this file viewable on a Japanese O/S without resorting to a browser.
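One way to check what the bytes coming back really are, without a browser, is to dump them in hex and see which decodings accept them. The following is a minimal sketch in Python (chosen only for illustration); the function names and the sample data are hypothetical, and a successful decode is only a hint about the encoding, not proof of it.

```python
# Hypothetical helpers for inspecting raw bytes fetched from the database.
CANDIDATES = ["utf-8", "shift_jis", "euc_jp", "iso2022_jp"]

def hex_dump(raw: bytes) -> str:
    """Return the bytes as space-separated hex pairs."""
    return " ".join(f"{b:02x}" for b in raw)

def accepted_by(raw: bytes) -> list:
    """Return the candidate encodings that decode raw without error."""
    ok = []
    for name in CANDIDATES:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue
        ok.append(name)
    return ok

# Sample data standing in for one fetched column value.
sample = "日本語".encode("shift_jis")
print(hex_dump(sample))
print(accepted_by(sample))
```

Running the same check on a value fetched by each client would show whether they really receive identical bytes.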
The browser does correctly render the output and reports it as Shift-JIS. I have tried a few Perl modules to output UTF8 (raw works, in the browser only), ShiftJIS (never works in anything), JIS and other encodings; it always fails. I have not been able to produce a single readable text document from Perl. Only raw UTF-8 output, auto-detected as Shift-JIS and viewed in a browser, works. I am doing this from a terminal connected to a Solaris box, and I know the output it is giving me from the database is correct.
Does anyone with experience in Japanese apps, Perl and Unix have any further ideas as to why I cannot output a delimited text file in Japanese? This is more than an issue of having the correct fonts installed on Unix. I only need to view the output in a text document on my Japanese PC here.
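For the delimited file itself, the general shape of a solution is to get the values as real Unicode text first and then encode the file explicitly in the encoding the viewing machine expects. A minimal sketch in Python, for illustration only: it assumes the rows are already decoded strings and that the Japanese PC runs Windows and reads text files as CP932 (Microsoft's Shift-JIS variant). The function name, file name and sample rows are hypothetical.

```python
import csv
import os
import tempfile

def write_delimited(path, rows, encoding="cp932"):
    """Write rows as a tab-delimited text file in the given encoding."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", encoding=encoding, newline="") as fh:
        writer = csv.writer(fh, delimiter="\t")
        writer.writerows(rows)

# Hypothetical sample rows standing in for decoded database values.
rows = [["1", "東京"], ["2", "大阪"]]
out_path = os.path.join(tempfile.mkdtemp(), "report.txt")
write_delimited(out_path, rows)

# Reading the raw bytes back confirms the file is CP932 on disk.
with open(out_path, "rb") as fh:
    data = fh.read()
print(data.decode("cp932"))
```

With the default strict error handling, a character that has no CP932 mapping raises an error here instead of being silently dropped, which may help when tracking down missing characters.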
In Japan I have only used Java, C++ and JPerl, and only natively. I have never mixed them with a US server and web apps that have Perl reporting backends and several language formats.
I would very much appreciate any leads on this. I simply hate to be lost on a topic to this extent.
Akira
P.S. Please forgive my less than adequate English or any misspellings. This takes much time for me to write and format correctly.
Replies are listed 'Best First'.

Re: UTF8, Perl 5.6 and Oracle Revisited....
by rdfield (Priest) on Oct 10, 2002 at 13:25 UTC
by Akira71 (Scribe) on Oct 10, 2002 at 13:35 UTC
by rdfield (Priest) on Oct 10, 2002 at 13:58 UTC

Re: UTF8, Perl 5.6 and Oracle Revisited....
by l2kashe (Deacon) on Oct 11, 2002 at 04:17 UTC

Re: UTF8, Perl 5.6 and Oracle Revisited....
by samgold (Scribe) on Oct 11, 2002 at 05:19 UTC