in reply to Apache+PerlCGI: accent problems
The solution is to use the Encode module, and it involves several steps.
Note that some of these steps may be omitted if the incoming and outgoing character encodings are the same.
As a first step, try to figure out what encoding the Java program expects as input and what encoding it produces as output. For instance, try something like:
    use utf8;    # assumes this script is saved as UTF-8, since it contains a literal "ġ"
    use Encode;

    my $word = "naġ Exists";
    my $encoded_word = Encode::encode('utf8', $word);
    my $out1 = `java -classpath /usr/local/lib/CS.jar csearch/CorpusSearch 'HT +MLQ(($encoded_word))' c_006_pos.txt.cs`;
    print $out1;
and see what $out1 looks like. You can use Firefox to do this: just use View -> Character Encoding -> More Encodings -> Unicode to try out some different encodings. If utf8 doesn't work, try 'UTF-16', which is another popular encoding to use with Java.
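If switching encodings in the browser gets tedious, the same experiment can be done from Perl. The following is only a rough sketch: `plausible_encodings` is a made-up helper name, and the sample bytes stand in for whatever the Java program really prints. It dumps the raw bytes as hex and reports which candidate encodings decode them without error. Keep in mind that iso-8859-1 accepts any byte sequence, so a clean decode there proves nothing by itself.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: return the names of the candidate encodings
# under which $bytes decodes without raising an error.
sub plausible_encodings {
    my ($bytes, @candidates) = @_;
    my @ok;
    for my $name (@candidates) {
        my $copy = $bytes;    # decode() may modify its argument when CHECK is set
        my $text = eval { Encode::decode($name, $copy, Encode::FB_CROAK) };
        push @ok, $name if defined $text;
    }
    return @ok;
}

# Stand-in for the Java program's output: "naġ" as UTF-8 bytes.
my $out1 = "na\xC4\xA1";

# Hex dump, so the raw bytes can be inspected directly.
print join(' ', map { sprintf '%02x', ord } split //, $out1), "\n";

print "decodes cleanly as: $_\n"
    for plausible_encodings($out1, 'UTF-8', 'UTF-16', 'iso-8859-1');
```

In the real script, replace the sample string with the output of the backtick call.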
After you've figured out the Java part, decide on an output encoding (either latin1 or utf8), add a charset parameter to your Content-type header, and use Encode::encode to encode the output, e.g.:
    print "Content-type: text/html; charset=utf-8\n\n";
    ... set $out1 from java program ...
    print Encode::encode('utf8', $out1);
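One thing to watch: backticks hand you raw bytes, and Encode::encode expects a character string, so encoding the Java output directly can double-encode any accented characters. A minimal sketch of the full round trip follows. Here `build_response` is a hypothetical helper, and UTF-16BE is only an assumed value for the Java side; substitute whatever encoding your earlier experiments pointed to.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: turn the Java program's raw output bytes into a
# complete CGI response whose body is encoded as UTF-8.
sub build_response {
    my ($raw_bytes, $java_encoding) = @_;

    # Bytes from the external program -> Perl character string.
    my $text = Encode::decode($java_encoding, $raw_bytes);

    # The charset in the header must match the encoding applied to the body.
    my $header = "Content-type: text/html; charset=utf-8\n\n";

    # Character string -> UTF-8 bytes for the browser.
    return $header . Encode::encode('UTF-8', $text);
}

# Stand-in for the backtick call: "naġ" as UTF-16BE bytes (an assumption).
my $out1 = pack('n*', 0x6E, 0x61, 0x121);

binmode STDOUT;    # write the response bytes unchanged
print build_response($out1, 'UTF-16BE');
```

If the Java program already emits the encoding you want to send, the decode and encode steps cancel out and can be dropped, which is the case mentioned above where the incoming and outgoing encodings are the same.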