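use strict;
use warnings;
use utf8;                             # the source contains literal Vietnamese characters
binmode STDOUT, ':encoding(UTF-8)';   # print the words as UTF-8

# Map every letter to one ASCII character so that a plain cmp follows the
# Vietnamese alphabet (a ă â b c d đ e ê ...). Tone marks are ignored here.
sub make_sort_order {
    my $str = shift;
    $str =~
        tr(aáàảãạăắằẳẵặâấầẩẫậbcdđeéèẻẽẹêếềểễệfghiíìỉĩịjklmnoóòỏõọôốồổỗộơớờởỡợpqrstuúùủũụưứừửữựvwxyýỳỷỹỵz)
          (0000001111112222223456777777888888abcddddddefghijjjjjjkkkkkkllllllmnopqrrrrrrsssssstuvwwwwwwx)d;
    return $str;
}

my @words = ('ầm', 'ãm', 'ấm chè', 'ám số');

# Decorate each word with its sort key, sort, then print the original word.
print $_->[1], "\n" for
    sort { $a->[0] cmp $b->[0] || $a->[1] cmp $b->[1] }
    map { [ make_sort_order($_), $_ ] } @words;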
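It's still missing a correct 'secondary sort' for the edge case where two words get identical keys, i.e. they differ only in their tone marks; the fallback comparison on the original strings merely orders them by code point. It should not be difficult to add once someone figures out a suitable transliteration of the tones that sorts asciibetically.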
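One possible shape for such a secondary key, as a rough sketch rather than a settled answer: decompose each word with Unicode::Normalize's NFD and emit one digit per base character, 0 for no tone and 1 to 5 for the five tone marks. The name make_tone_key and the tone ranking (acute, grave, hook above, tilde, dot below, mirroring the order in which the letters are listed in the tr above) are assumptions of this sketch; dictionaries may rank the tones differently, in which case only the %tone_rank values need to change.

```perl
use strict;
use warnings;
use utf8;
use Unicode::Normalize qw(NFD);
binmode STDOUT, ':encoding(UTF-8)';

# Assumed tone ranking; adjust the numbers if another order is wanted.
my %tone_rank = (
    "\x{0301}" => 1,   # acute
    "\x{0300}" => 2,   # grave
    "\x{0309}" => 3,   # hook above
    "\x{0303}" => 4,   # tilde
    "\x{0323}" => 5,   # dot below
);

# Hypothetical helper: one digit per base character of the word.
sub make_tone_key {
    my $str = NFD(shift);
    my $key = '';
    for my $ch (split //, $str) {
        if (exists $tone_rank{$ch}) {
            # A tone mark overwrites the digit of the letter it sits on.
            substr($key, -1, 1, $tone_rank{$ch}) if length $key;
        }
        elsif ($ch !~ /\p{Mn}/) {
            # Any non-combining character (letter, space, ...) gets a slot.
            $key .= '0';
        }
        # Other combining marks (circumflex, breve, horn) belong to the
        # base letter and are already covered by the primary key.
    }
    return $key;
}

my @same_base = ('mạ', 'má', 'ma', 'mã', 'mà', 'mả');
print join(' ', sort { make_tone_key($a) cmp make_tone_key($b) } @same_base), "\n";
# prints: ma má mà mả mã mạ
```

To use it in the sort above, the idea would be to compare make_tone_key of the original words wherever the make_sort_order keys compare equal, in place of the raw string comparison.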
In reply to Re^2: Sorting Vietnamese text
by Anonymous Monk
in thread Sorting Vietnamese text
by pdenisowski