wcruz has asked for the wisdom of the Perl Monks concerning the following question:

Is there a step-by-step tutorial on how to create and include a collation table for sorting text in a new language (Latin Extended characters)? I am a novice and am not sure how to use the PERL::COLLATE thing. Please help. wcruz

Replies are listed 'Best First'.
Re: Creating a new unicode collation table
by graff (Chancellor) on Aug 21, 2007 at 04:32 UTC
    There is no "PERL::COLLATE". Are you perhaps referring to Unicode::Collate? (Module names are case-sensitive.) I haven't tried it yet, but judging from the manual, initializing a new Collator object correctly takes a lot of preparation, including looking up the collation details in the Unicode standards published at www.unicode.org.

    There's likely to be a standard collation table already defined for the language/characters you are using. Have you looked at perllocale yet? (Do your OS and perl installation support locales? The perllocale manual explains how to check that.)
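
As a rough illustration of the locale check mentioned above, here is a minimal sketch that asks whether this perl was built with setlocale() and which LC_COLLATE locale the environment selects. The helper name locale_report is hypothetical, and the perllocale manual remains the authoritative description of how to check locale support.

```perl
use strict;
use warnings;
use Config;
use POSIX qw(locale_h);    # imports setlocale() and the LC_* constants

# locale_report() is a hypothetical helper name used only in this sketch.
# Returns (1|0, name-or-undef): whether perl was built with setlocale(),
# and the LC_COLLATE locale picked up from the environment, if any.
sub locale_report {
    my $built_in = defined $Config{d_setlocale}
                && $Config{d_setlocale} eq 'define';
    # An empty string asks setlocale() to use the environment's settings
    # (LC_ALL, LC_COLLATE, LANG); it returns undef if that fails.
    my $collate = $built_in ? setlocale(LC_COLLATE, '') : undef;
    return ($built_in ? 1 : 0, $collate);
}

my ($supported, $collate) = locale_report();
print "setlocale available: ", ($supported ? "yes" : "no"), "\n";
print "LC_COLLATE locale:   ",
      (defined $collate ? $collate : "(could not be set)"), "\n";
```

If a suitable locale exists, sorting under "use locale" would follow its collation rules; for a language with no predefined locale this check will probably only confirm that a custom ordering is needed.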

    What "new language" are you dealing with?

      Thanks for the reply, graff. Sorry, I meant Unicode::Collate. I am working on one of the languages of the Nicobars, India, which uses a modified Roman script. I need to get characters like 'e breve' before 'e', 'o acute' after 'o', etc., to index a list of words in the native alphabetical order. wcruz
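
The ordering described above (e breve before e, o acute after o) can be sketched without a collation table at all. The following is not Unicode::Collate; it is a plain alphabet-rank sort key, shown only as one possible starting point. The alphabet is an assumption (a to z plus the two letters mentioned) and would need to be replaced with the real native order. The names sort_key and native_sort are hypothetical.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC);

# ASSUMED alphabet, for illustration only: a-z with U+0115 (e breve)
# placed before 'e' and U+00F3 (o acute) placed after 'o'.
my @alphabet = map {
      $_ eq 'e' ? ("\x{115}", 'e')
    : $_ eq 'o' ? ('o', "\x{F3}")
    :             ($_)
} 'a' .. 'z';

# Rank of each letter in the alphabet, starting at 1.
my %rank;
@rank{@alphabet} = (1 .. scalar @alphabet);

# Build a key of big-endian 16-bit ranks, one per character, so that a
# plain string comparison of keys follows the alphabet order above.
# Characters outside the alphabet get the highest rank and sort last.
sub sort_key {
    my ($word) = @_;
    my @chars = split //, lc NFC($word);   # compose accents, fold case
    my @ranks = map { exists $rank{$_} ? $rank{$_} : 0xFFFF } @chars;
    return pack 'n*', @ranks;
}

# Sort words by their keys, computing each key only once.
sub native_sort {
    my @words = @_;
    return map  { $_->[1] }
           sort { $a->[0] cmp $b->[0] }
           map  { [ sort_key($_), $_ ] } @words;
}

my @sorted = native_sort("ox", "\x{F3}x", "eb", "\x{115}z", "dz", "pa");
# Order: dz, (e breve)z, eb, ox, (o acute)x, pa

binmode STDOUT, ':encoding(UTF-8)';
print "$_\n" for @sorted;
```

This compares at a single level only: case is folded and there is no tie-breaking on accents beyond the alphabet order, which is part of what Unicode::Collate handles and this sketch does not.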