crenz has asked for the wisdom of the Perl Monks concerning the following question:

So far I've been using a home-brewed Perl-in-HTML interpreter to realize multilingual webpages. For example, I'll have URLs like

http://server/de/page.html
http://server/en/page.html

I then configured mod_rewrite to redirect these to the "true" file and set an environment variable LANG=de (or LANG=en). In my HTML file, I write

bla bla bla <!en>English page only <!de>Nur auf der deutschen Seite <!>This is gonna appear in both versions again

Now I'm looking to reimplement the system using something more standardized, feature-rich and tested (the C code implementing my interpreter is a mess...). I'm thinking of using Mason. However, I don't want to write lengthy checks like if ($ENV{LANG} eq "en") { ..... The point of the system is to have a short command so that I can freely intermix different languages, which makes it easier to translate a webpage.

Would I have to modify Mason's parser to realise concise tags like <!de>, or can someone come up with a nice way to implement something like this more easily?

Re: Multilingual Mason
by valdez (Monsignor) on May 03, 2003 at 11:58 UTC

    One way I see is to extend Mason's syntax using the preprocess option of HTML::Mason::Compiler. The Mason authors instead suggest extending the compiler itself; there is an example of that method in chapter 6 of Embedding Perl in HTML with Mason.
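As a rough illustration of the preprocess idea, here is a minimal sketch that rewrites the `<!en>` / `<!de>` / `<!>` markers into ordinary Mason `%` conditional lines before compilation. The function name `expand_lang_tags` and the two-letter tag pattern are assumptions made for this example; the hook is assumed to receive a reference to the component source, as the preprocess option is documented to do. Treat it as a starting point, not a tested Mason setup.

```perl
use strict;
use warnings;

# Hypothetical preprocess callback: turns <!xx> ... <!> markers into
# Mason "%" conditional lines. It takes a reference to the component
# source and modifies that source in place.
sub expand_lang_tags {
    my ($src_ref) = @_;

    # split with a capturing group yields: text, lang, text, lang, text, ...
    # where lang is undef for the bare <!> marker.
    my @parts = split /<!([a-z]{2})?>/, $$src_ref, -1;

    my $out = shift @parts;
    $out = '' unless defined $out;    # empty component source
    my $open = 0;                     # true inside a language-only section

    while (@parts) {
        my ($lang, $text) = splice @parts, 0, 2;
        $out .= "\n% }\n" if $open;   # close the previous section
        if (defined $lang) {
            $out .= "\n" unless $open;    # "%" lines must start a line
            $out .= "% if ((\$ENV{LANG} || '') eq '$lang') {\n";
        }
        $open = defined $lang;
        $out .= $text;
    }
    $out .= "\n% }\n" if $open;       # section still open at end of file

    $$src_ref = $out;
    return;
}

# Small demonstration on a string; no Mason needed to see the rewrite.
my $demo = "bla <!en>English only <!de>Nur deutsch <!>both again";
expand_lang_tags(\$demo);
print $demo, "\n";

# Assumed wiring (not run here):
#   HTML::Mason::Interp->new(comp_root  => '/path/to/comps',
#                            preprocess => \&expand_lang_tags);
```

Because the markers become runtime conditionals, the compiled component can be cached once and still serve every language. The price is an extra line break wherever a marker sits in the middle of a line, which is usually harmless in HTML.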

    On the other hand, I don't understand why you don't have separate pages for every language you need to display. Isn't it better to use Mason components and override portions of your pages locally? Or you can extend HTML::Mason::Resolver and make it search for components in a different path for each language (refer to the aforementioned chapter 6 of the Mason book).
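The per-language lookup could work roughly as follows. This is a plain-Perl sketch of the search order only, not a real HTML::Mason::Resolver subclass; the function name `resolve_for_lang` and the directory layout (a language directory such as `de/` under the component root, with shared files at the top level) are assumptions for illustration.

```perl
use strict;
use warnings;
use File::Spec;

# Return the first existing file for a component path, preferring the
# language-specific directory and falling back to the shared root.
# Returns nothing (undef in scalar context) when no file is found.
sub resolve_for_lang {
    my ($comp_root, $lang, $comp_path) = @_;

    # Drop empty segments so a leading "/" in the path is harmless.
    my @segments = grep { length } split m{/}, $comp_path;

    for my $dir (File::Spec->catdir($comp_root, $lang), $comp_root) {
        my $file = File::Spec->catfile($dir, @segments);
        return $file if -f $file;
    }
    return;
}

# Example call (paths are hypothetical):
#   my $file = resolve_for_lang('/var/www/comps', $ENV{LANG} || 'en', '/header.mas');
```

With this order, a translated header or footer in the language directory wins, and anything not translated yet falls back to the shared copy.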

    HTH, Valerio

      Thanks for the suggestions! I will check them out.

      "I don't understand why you don't have separate pages for every language you need to display."

      Different component paths for different languages make lots of sense -- it's a good idea that I will use for headers, footers etc.

      However, I previously (years ago) used a system that used different files for different languages, and it's a maintenance nightmare. It's easy to do the initial translation: Just copy en/some.html over to de/some.html and replace all English text with the German translation. However, to make changes, you first change it in one version, then you open the other file, search all the way down to the place where you want to change something, insert the new translation, forget half of the new stuff that you wrote, forget the other change that you made etc.

      I'm much happier with the current approach -- it saves me a lot of time and helps me prevent stupid mistakes.

        Most of the multilingual sites that I've seen (not many :) use some kind of macro/make pre-formatter that adjusts only the "code" of the other language versions after you edit the code of, say, the English one. This makes for very boring code, as fancy code is kept well away from (language) string assignment, but it does keep things in order. BTW, use POSIX character class translations for any s///, m//, tr///, etc.; a-z --> A-Z doesn't fare that well on Chinese characters. Chris
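To make the character-class point concrete, here is a small sketch comparing a hard-coded a-z range with Perl's Unicode-aware alternatives. The sample word (three Cyrillic letters written as escapes) is an assumption chosen because it has upper and lower case; Chinese characters have no case at all, so a literal a-z range simply never matches them.

```perl
use strict;
use warnings;

# Three lowercase Cyrillic letters, written as escapes (hypothetical sample).
my $word = "\x{43c}\x{438}\x{440}";

# A hard-coded ASCII range leaves non-ASCII letters untouched.
(my $ascii_only = $word) =~ tr/a-z/A-Z/;

# uc applies Unicode case mapping to these characters.
my $upper = uc $word;

# A literal range misses the letters; the POSIX class matches them.
my $matches_range = $word =~ /[a-z]/       ? 1 : 0;
my $matches_posix = $word =~ /[[:lower:]]/ ? 1 : 0;

printf "range: %d, posix: %d, tr changed the word: %s\n",
    $matches_range, $matches_posix, ($ascii_only eq $word ? 'no' : 'yes');
```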
Re: Multilingual Mason
by Anonymous Monk on May 05, 2003 at 08:19 UTC
    If the user has set up a preferred language in his browser, you can use "page.en.html", "page.de.html", etc. to separate them.

      This is not relevant to the problem at hand. You are talking about how to decide which page to serve to the user. That is a question of server configuration, and I already solved that. (FWIW, I don't use auto-negotiation, for several reasons.)