In essence, all you need is a custom string comparison routine. And the easiest, most efficient way to do that in Perl is to use the tr/// operator to 'encode' the strings and then use the standard cmp operator.
Say your custom sort rules call for 0-4 to be sorted before alphas, and 5-9 after. And within the alphas, you want upper and lower case of any given character to be sorted together. Then you set up a mapping from each character in the original strings to a character whose ordinal value falls in the required order.
For the given example:

    tr[0-4AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz5-9][\x00-\xff];

will do the trick. Now, you just transliterate your string and sort in the normal way:
    #! perl -slw
    use strict;

    # Encode a string so that plain cmp gives the custom order.
    sub trans {
        my $in = shift;
        $in =~ tr[0-4AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz5-9]
                 [\x00-\xff];
        return $in;
    }

    chomp( my @data = <DATA> );

    # Compare the encoded forms; the original strings are what get sorted.
    my @sorted = sort{ trans( $a ) cmp trans( $b ) } @data;

    print for @sorted;

    __DATA__
    cdef
    0123456
    abcd
    50011
    ABCD
    4999
    Zxyw
    CDEF
    zxyw
    9999
Produces:
    c:\test>junk78
    0123456
    4999
    ABCD
    abcd
    CDEF
    cdef
    Zxyw
    zxyw
    50011
    9999
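To see why that works, it can help to look at the encoded strings themselves. The following is only a small sketch: `encoded_hex` is a helper name made up for this illustration, and the replacement range is written as `\x00-\x3d` so that it matches the 62 characters in the search list exactly (the behaviour is the same as with `\x00-\xff`).

```perl
use strict;
use warnings;

# Same mapping as above: 0-4 first, then Aa Bb ... Zz, then 5-9.
# The search list holds 62 characters, so the replacement is \x00-\x3d.
sub trans {
    my $in = shift;
    $in =~ tr[0-4AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz5-9]
             [\x00-\x3d];
    return $in;
}

# Hypothetical helper: the encoded string as lowercase hex, one pair per character.
sub encoded_hex {
    return unpack 'H*', trans( shift );
}

# Prints 00, 04, 05, 06, 37, 38, 39 and 3d next to the respective inputs.
printf "%-2s => %s\n", $_, encoded_hex( $_ ) for qw( 0 4 A a Z z 5 9 );
```

The encoded values rise in exactly the order the rules ask for, which is all that cmp needs. Note that characters outside the search list (spaces, punctuation) are passed through unchanged, so they can collide with encoded values unless they are added to the mapping too.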
Of course, you can now apply all the usual forms of sort optimisations--Schwartzian Transform (ST), Guttman-Rosler Transform (GRT) etc.--to that, but the mechanism remains the same.
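As a rough sketch of the ST variant, assuming the same `trans` routine as above (`st_sort` is just a name picked for this example, and the replacement range is trimmed to the 62 characters actually used):

```perl
use strict;
use warnings;

# Encode a string so that plain cmp gives the custom order.
sub trans {
    my $in = shift;
    $in =~ tr[0-4AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz5-9]
             [\x00-\x3d];
    return $in;
}

# Schwartzian Transform: pair each string with its encoded key,
# sort on the key, then throw the key away again.
sub st_sort {
    return map  { $_->[1] }
           sort { $a->[0] cmp $b->[0] }
           map  { [ trans( $_ ), $_ ] } @_;
}

print "$_\n" for st_sort( qw( cdef 0123456 abcd 50011 ABCD 4999 Zxyw CDEF zxyw 9999 ) );
```

The gain is that trans() runs once per element instead of twice per comparison, which starts to matter as the list grows.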
The ugly head of UTF will probably complicate things a little, but, at least in the eyes of my non-UTF aware brain, it should be possible to apply the same mechanism.
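One way the same idea might carry over to wider character sets is to build the mapping from a lookup table instead of tr///. This is only a sketch under assumptions: the alphabet below is invented for illustration, and `%rank`, `encode` and `custom_sort` are hypothetical names, not anything from the original thread. It needs perl 5.10 or later for the `//` operator.

```perl
use strict;
use warnings;

# Hypothetical alphabet in the desired order;
# \x{e1} is a-acute and \x{e9} is e-acute.
my @alphabet = ( 'a', "\x{e1}", 'b', 'c', 'd', 'e', "\x{e9}", 'f' );

# Rank of each character = its position in the alphabet.
my %rank;
@rank{ @alphabet } = 0 .. $#alphabet;

# Replace each character by chr(rank). Characters outside the alphabet
# are shifted past the ranks, so they sort after it and cannot collide.
sub encode {
    my $in = shift;
    return join '',
        map { chr( $rank{ $_ } // ( ord( $_ ) + @alphabet ) ) }
        split //, $in;
}

sub custom_sort {
    return sort { encode( $a ) cmp encode( $b ) } @_;
}

binmode STDOUT, ':encoding(UTF-8)';
print "$_\n" for custom_sort( 'b', "\x{e1}b", 'ab', "\x{e9}", 'e' );
```

With a plain cmp the accented strings would land after 'e'; here they sort directly after their unaccented counterparts, because cmp only ever sees the ranks.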
In reply to Re: New Alphabet Sort Order by BrowserUk, in thread New Alphabet Sort Order by Polyglot