in reply to Sorting Vietnamese text
Update: Sorry, there are some errors in the code below. In particular, the constructor for the collator should be this:
my $Collator = Unicode::Collate::Locale->new(locale => 'vi');
Then the sort method will work as intended. Try it with actual Vietnamese words.
Unicode::Collate::Locale ought to help. The example code below is not in code tags because of a display bug with UTF-8 text.
#!/usr/bin/env perl
use v5.14;
use warnings;
use utf8::all;
use Unicode::Collate::Locale;
my $Collator = Unicode::Collate::Locale->new('vi'); # erroneous: should be new(locale => 'vi'), see the update at the top
my @unsorted = qw(
a..7
ả..3
à..9
ạ..5
ã..4
á..1
ă..6
à..2
á..8
);
my @sorted = $Collator->sort(@unsorted);
say "unsorted\n@unsorted";
say "sorted\n@sorted";
Output is as follows.
unsorted
a..7 ả..3 à..9 ạ..5 ã..4 á..1 ă..6 à..2 á..8
sorted
á..1 à..2 ả..3 ã..4 ạ..5 ă..6 a..7 á..8 à..9
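Since the constructor call is the part that went wrong here, a quick sanity check may be worthwhile. The sketch below builds the collator with the named locale argument and asks it, via getlocale, which tailoring it actually ended up with. The lowercase variable names are mine, not from the code above, and it prints only ASCII, so it does not need utf8::all.

```perl
#!/usr/bin/env perl
use v5.14;
use warnings;
use Unicode::Collate::Locale;

# Named argument form: the key 'locale' is required.
my $collator = Unicode::Collate::Locale->new(locale => 'vi');

# getlocale returns the language code actually used for collation;
# it falls back to 'default' when no tailoring was selected.
my $in_use = $collator->getlocale;

die "expected the 'vi' tailoring, got '$in_use'\n" unless $in_use eq 'vi';
say "collator locale: $in_use";
```

If the check dies, the collator is not using Vietnamese rules and the sort order cannot be trusted.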
Update #2: The code below actually is a correct example.
#!/usr/bin/env perl
use v5.14;
use warnings;
use utf8::all;
use Unicode::Collate::Locale;
my $Collator = Unicode::Collate::Locale->new(locale =>'vi');
my @unsorted = ('á', 'ả', 'ã', 'à', 'ậ', 'ă', 'ạ', 'ẫ', 'a', 'ẩ' );
my @sorted = $Collator->sort(@unsorted);
say "unsorted\n@unsorted";
say "sorted\n@sorted";
Giving the output:
unsorted
á ả ã à ậ ă ạ ẫ a ẩ
sorted
a à ả ã á ạ ă ẩ ẫ ậ
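For actual words rather than single letters, the same collator can be used inside an ordinary sort block through its cmp method. This is only a sketch: the @places records and their field names are made up for illustration, and I have used the core pragmas utf8 and open in place of utf8::all so that it runs without extra modules.

```perl
#!/usr/bin/env perl
use v5.14;
use warnings;
use utf8;                              # source contains UTF-8 literals
use open qw(:std :encoding(UTF-8));    # core-only stand-in for utf8::all
use Unicode::Collate::Locale;

my $collator = Unicode::Collate::Locale->new(locale => 'vi');

# Hypothetical records, just to have something with a name field.
my @places = (
    { name => 'Huế',      region => 'central' },
    { name => 'Hà Nội',   region => 'north'   },
    { name => 'Đà Nẵng',  region => 'central' },
    { name => 'Cần Thơ',  region => 'south'   },
    { name => 'An Giang', region => 'south'   },
);

# cmp behaves like the built-in cmp operator but applies the
# collator's rules, so it can go straight into a sort block.
my @by_name = sort { $collator->cmp($a->{name}, $b->{name}) } @places;

say "$_->{name} ($_->{region})" for @by_name;
```

For plain lists of strings, $collator->sort(@list) as in the code above is simpler; cmp is mainly useful when the string to compare is buried inside a larger structure.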