I am contemplating how difficult or otherwise it would be to implement multilingual applications in Perl. In particular, if there were string literal objects that were used for all messages, this would allow global language switching, in a similar manner to how locale does it for collating sequences and formatting of numerics.

Has anybody done or seen this? Anybody aware of commercial code that does this (I am aware of a commercial system that uses this technique, but not in Perl)?

I read with interest this article by maverick about translation on the CB.

Another thought is how to take an existing app and make it multilingual. I envisage an equivalent of taint mode, flagging all untranslated string literals. The application can thus generate warnings about untranslated text - or even babelfish it on the fly.
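
A minimal sketch of the warning idea, assuming every message is routed through a lookup function. The names `t`, `%catalogue` and `$current_lang` are hypothetical, and unlike taint mode this only catches strings that actually pass through `t()`, not every literal in the program:

```perl
use strict;
use warnings;

# Hypothetical catalogue: original text => translation, keyed by language.
my %catalogue = (
    fr => { 'Add to trolley' => 'Ajouter au panier' },
);

my $current_lang = 'fr';   # assumed global switch, in the spirit of a locale
my %untranslated;          # strings seen without a translation

# Look up a message. Warn once per missing string, then fall back
# to the original text so the application keeps working.
sub t {
    my ($text) = @_;
    my $found = $catalogue{$current_lang}{$text};
    return $found if defined $found;
    warn "untranslated [$current_lang]: $text\n" unless $untranslated{$text}++;
    return $text;
}

print t('Add to trolley'), "\n";
print t('Checkout'), "\n";
```

The keys of `%untranslated` then double as a to-do list for the translator.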

I think that this would be a boon for international website developers. The HTML content could be in static files with a different file for each language, but the dynamic stuff (shopping trolleys etc.), produced by code, needs to come up in the user's language.

Thoughts please. Is this doable?

Re: Multilingual Perl applications
by derby (Abbot) on Mar 25, 2002 at 21:04 UTC
    rinceWind,

    Having done this with sites, let me tell you it's pretty hard to do well. Maybe our site just did it wrong, but we would tag all text as "translatable." If the user was viewing the site in English, no big deal: we would just output what was inside the tags. If they were viewing it in another language, we would extract the translated data from dictionary files. You could do a dynamic Babelfish translation here. With a good caching scheme, only the first viewer pays the price for translation.
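
The tagging scheme described above might look roughly like this. It is only a sketch: the `<t>...</t>` tag name, the `%dictionary` layout and the function names are assumptions, not what derby's site used, and the cache is where an expensive on-the-fly translation would be stored:

```perl
use strict;
use warnings;

# Hypothetical dictionary data, as if loaded from per-language files.
my %dictionary = ( de => { 'Welcome' => 'Willkommen' } );
my %cache;   # $cache{$lang}{$text} holds text that was already resolved

# Resolve one tagged string. English is the source language, so it is
# passed through. Unknown strings fall back to the original text.
sub lookup {
    my ($text, $lang) = @_;
    return $text if $lang eq 'en';
    return $cache{$lang}{$text} //= $dictionary{$lang}{$text} // $text;
}

# Replace every <t>...</t> span in a page with its translation.
sub translate_page {
    my ($page, $lang) = @_;
    $page =~ s{<t>(.*?)</t>}{ lookup($1, $lang) }ges;
    return $page;
}

print translate_page('<h1><t>Welcome</t></h1>', 'de'), "\n";
```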

    No, the hard part is how much to tag, where to tag, and so on. Translating by sentence can lead to some really fractured paragraphs, and translating by paragraph can lead to some really fractured pages. Translations also play havoc with layout. A simple four-letter word in language A can turn into a 40-character behemoth in language Z. Maybe better use of stylesheets could solve that problem.

    So doable? Yes. Doable well? Probably not. Doable enough for speakers of other languages to understand? Yes. Doable enough for speakers of other languages to be wow-ed? Probably not.

    -derby

Re: Multilingual Perl applications
by Anonymous Monk on Mar 25, 2002 at 20:10 UTC
    You probably want to have a look at Locale::Maketext. I'll probably be doing some work to integrate HTML::Mason and Locale::Maketext for RT (http://bestpractical.com/rt/) pretty soon.
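
For orientation, a small Locale::Maketext sketch. The `MyApp::L10N` class names and the lexicon entries are made up for illustration, and the language classes are kept in one file here rather than in separate modules:

```perl
use strict;
use warnings;

{
    package MyApp::L10N;                 # hypothetical project base class
    use parent 'Locale::Maketext';
}
{
    package MyApp::L10N::en;
    use parent -norequire, 'MyApp::L10N';
    our %Lexicon = ( _AUTO => 1 );       # English keys are the messages themselves
}
{
    package MyApp::L10N::fr;
    use parent -norequire, 'MyApp::L10N';
    our %Lexicon = (
        'Hello, [_1]!' => 'Bonjour, [_1] !',
    );
}

package main;

# Pick a language handle once, then route every message through it.
my $lh = MyApp::L10N->get_handle('fr') or die "no language handle";
print $lh->maketext('Hello, [_1]!', 'rinceWind'), "\n";
```

Switching the argument of `get_handle` switches the language for every `maketext` call made through that handle.
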
Re: Multilingual Perl applications
by oakbox (Chaplain) on Mar 26, 2002 at 07:36 UTC
    Golly, I just finished making one of my programs multi-lingual, so I read this post at JUST the right time. I don't know if this is the BEST solution:) But it seems to be working for me.

    First, I must admit that I do not speak every language on the planet. Knowing this, I wanted to put the translation engine into a script that would allow other people (non-programmers) to update the language sections they have expertise in, and keep me from having to rewrite the code every time.

    One solution I saw was to use translation files that the script would call. Each file would have a list of variable definitions:

    $language{'submit_button_text'} = "Submit";
    $language{'web_title_text'}     = "programmercentral.org";
    etc...

    The problem with this is that it's a pain for your translator to view the 'original' text and produce clean, translated text. It's a very labor-intensive task and means that you (the maintainer) have to spend a lot of time looking over the translator's shoulder to make sure that they don't break syntax.

    So, here was my solution (finally): I put the text for the web site into a MySQL database. I tried to make it as atomic as possible and rely heavily on CSS to handle my formatting. In this way, the translation information itself is free of HTML markup and requires NO expertise for the translator.

    Next, I set up a simple web interface (behind .htaccess) that lets my officially sanctioned translator open a web page showing the current language information, with blanks for the new language side by side. He or she just walks down the form and translates whatever is in the left box into the new language in the right box. This solves the expertise problem and makes it easy for any off-the-shelf translation professional to update your site.

    Now, within the script itself, I simply check for which language I should be looking at, dump the contents of the language to a hashref, then look in the hashref whenever it is time to display text to the screen. From this point forward, all I need to do to update the whole site is add a column to my language table and find a translator that speaks one of the current languages and the target language. After that, update a couple of tags making the new language 'live' on the site and PRESTO, new language. This has worked VERY well and my customer has been pleased with the solution too.
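
The lookup step might be sketched as follows. The row layout (a `text_key` column plus one column per language) and the fallback to a default language are assumptions, and the in-memory `@rows` stands in for what a DBI query against the language table would return:

```perl
use strict;
use warnings;

# Stand-in for rows of a hypothetical language table: one row per text
# item, one column per language. With DBI, rows of this shape could come
# from $dbh->selectall_arrayref($sql, { Slice => {} }).
my @rows = (
    { text_key => 'submit_button_text', en => 'Submit',    de => 'Absenden' },
    { text_key => 'cart_title',         en => 'Your cart', de => undef      },
);

# Dump one language column into a hashref, falling back to the default
# language for items the translator has not filled in yet.
sub load_language {
    my ($rows, $lang, $default) = @_;
    my %text;
    for my $row (@$rows) {
        $text{ $row->{text_key} } = $row->{$lang} // $row->{$default};
    }
    return \%text;
}

my $text = load_language(\@rows, 'de', 'en');
print $text->{submit_button_text}, "\n";
print $text->{cart_title}, "\n";
```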

    My two Euro-cents,
    -oakbox