in reply to Regexp and transliteration between languages

First, I know nothing about Roman or the intricacies of the language, so sorry if I totally miss what you are trying to do. From what it seems you want to do, this will work:
$chars = 'dny|kh|rj|ee|\w';
$foo = 'khatos';
push(@a, $1) while ($foo =~ /($chars)/g);
print join(' ', @a), "\n";
$foo = 'mukharjee';
push(@b, $1) while ($foo =~ /($chars)/g);
print join(' ', @b), "\n";
Results:
kh a t o s
m u kh a rj ee

There may be a more elegant way, though. The groupings in $chars are all the character combinations that should be treated as a single term. $chars will obviously have to be expanded to include all the possible groups for the language. Just keep the 3-character groups at the front, then the 2-character groups, because the alternation is tried left to right and the first alternative that matches wins.
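
If the list of groups grows, keeping it ordered by hand gets error-prone. Here is a rough sketch of one way to let Perl do the ordering, by sorting the groups longest-first before joining them into the alternation. The names build_pattern and split_groups are made up for this example, and the group list is just the one from the snippet above, not a complete set for any language:

```perl
use strict;
use warnings;

# Hypothetical helper: build the pattern with the longest groups first,
# so a 3-character group is always tried before a 2-character one.
sub build_pattern {
    my @groups = @_;
    my $alternation = join '|',
        map  { quotemeta($_) }                  # escape any regex metacharacters
        sort { length($b) <=> length($a) }      # longest group first
        @groups;
    return qr/($alternation|\w)/;               # fall back to a single word character
}

# Hypothetical helper: split a word into its groups using that pattern.
sub split_groups {
    my ($pattern, $word) = @_;
    my @pieces;
    push @pieces, $1 while $word =~ /$pattern/g;
    return @pieces;
}

# The groups can now be listed in any order.
my $pattern = build_pattern(qw(kh ee dny rj));
print join(' ', split_groups($pattern, 'mukharjee')), "\n";
```

This should print the same "m u kh a rj ee" as above. Whether sorting by length alone is enough depends on the groups the target language really needs, so treat it as a starting point.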

RE: Re: Regexp and transliteration between languages
by nuance (Hermit) on Jun 15, 2000 at 17:13 UTC

    Roman isn't a language, it's the alphabet. You may not know much about it, but you used it to write your message :-)

    What is being written here is a tool that lets you type in another language using the standard "English" keyboard, converting groups of "English" (or, more correctly, Roman) characters into the characters needed for the other language (which aren't on your keyboard).

    Nice solution BTW

    Nuance

      Roman isn't a language, it's the alphabet.

      Doh! That was dumb, I knew that. Thanks for reminding me though.

      gnat's solution has the elegance I was looking for, nice.