in reply to Sorting utf-8

You will have to supply your own comparison routine. By default, sort compares strings with cmp, which orders characters by their numeric code value (the 'ascii value'), not by alphabetical position. Therefore é comes way after z.

Chapter 3.2.153 of 'Learning Perl' by O'Reilly covers using your own algorithms for sort. The essence is to have a subroutine return -1 if $a should sort before $b, 0 if the two sort as equal, and 1 if $a should sort after $b, where $a and $b are the two array elements being compared.
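
To make that contract concrete, here is a minimal sketch of a named comparison routine. The name by_length and the sample words are made up for illustration; it orders strings by length rather than alphabetically, purely to show the -1 / 0 / 1 return values in action.

```perl
use strict;
use warnings;

# Hypothetical comparator: shorter strings sort first.
# sort sets the package variables $a and $b before each call.
sub by_length {
    return -1 if length($a) < length($b);   # $a goes before $b
    return  1 if length($a) > length($b);   # $a goes after $b
    return  0;                              # same position
}

my @sorted = sort by_length qw(pear fig banana);
print "@sorted\n";
```

In practice the three returns are usually collapsed into a single <=> or cmp expression, which produces the same -1, 0 or 1.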

You could perhaps make a hash like this:
my %characterOrder = ( 'a' => 1, 'â' => 1, 'ä' => 1, ... 'b' => 2, 'c' => 3, 'Ç' => 4, ... );
You can then compare the values of $characterOrder{$a} with $characterOrder{$b} like this:
sub GoodSort { $characterOrder{$a} <=> $characterOrder{$b} }
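
Note that GoodSort as written looks up the whole element in the hash, so it only works when every element is a single character. A rough sketch of one way to extend the idea to whole words follows: each word is turned into a list of ranks, and the lists are compared position by position. The helper names rank_key and compare_keys, the tiny rank table, the sample words, and the choice to sort unknown characters last are all assumptions made for the example. It needs Perl 5.10 or later for the // operator.

```perl
use strict;
use warnings;
use utf8;                               # the source below contains UTF-8 literals
binmode STDOUT, ':encoding(UTF-8)';

# Illustrative ranks only; a real table needs every character you expect.
my %characterOrder = (
    'a' => 1, 'â' => 1, 'ä' => 1,
    'b' => 2,
    'c' => 3, 'ç' => 4,
    'e' => 5, 'é' => 5,
    'z' => 6,
);

# Turn a word into an array of ranks, one per character.
# Characters missing from the table get a large rank (assumption: sort last).
sub rank_key {
    my ($word) = @_;
    return [ map { $characterOrder{$_} // 1_000_000 } split //, $word ];
}

# Compare two rank arrays position by position.
sub compare_keys {
    my ($x, $y) = @_;
    my $n = scalar(@$x) < scalar(@$y) ? scalar(@$x) : scalar(@$y);
    for my $i (0 .. $n - 1) {
        my $c = $x->[$i] <=> $y->[$i];
        return $c if $c;                # first differing position decides
    }
    return scalar(@$x) <=> scalar(@$y); # on a common prefix, shorter word first
}

my @words  = ('zeb', 'éca', 'bac', 'âbe', 'ace');
my %key    = map { $_ => rank_key($_) } @words;   # compute each key once
my @sorted = sort { compare_keys($key{$a}, $key{$b}) } @words;

print "@sorted\n";                      # âbe ace bac éca zeb
```

A plain sort on the same list would give ace, bac, zeb, âbe, éca, because â and é have higher code values than z. Since 'a' and 'â' share rank 1 here, words that differ only in accents compare as equal; give them distinct ranks if you need a fixed order among them.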

Re: Re: Sorting utf-8
by Anonymous Monk on Apr 24, 2003 at 10:31 UTC
    Hi,

    Thanks for the reply. I thought of doing a listing, but then I realised that I need alphabets for Latin 1, Latin 2, Greek, Cyrillic and Maltese. If all else fails it is something to fall back on, but before that I wanted other thoughts/options.

    Anne