I need to change the ASCII collating sequence so that all non-alphanumerics sort together before numerics and alphas, and upper and lower cases are kept together. I thought that I could have the sort function's comparator transliterate $a and $b for the purpose of comparison, but it's failing and I don't see why.
#!/usr/bin/env perl
use 5.010;
use warnings;
use strict;

say <<'EOF';
.,-;:!?"'`_#$%&*+/|=@\^~()<>[]{}0123456789AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz intended sequence
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~ natural ASCII sequence
EOF

my @list = qw{ "Hello" Abel (hello) {adieu} @adieu [goodbye] Charlie ^Charlie ~Adieu zebra 21708 baker . - ; : ! ? " ' ` _ };

my @sorted_list = sort {
    # for each $a:$b comparison, transliterate $a and $b into temp vars $x and $y
    my ($x,$y) = map {
        my $z = $_ =~ tr
            / .,-;:!?"'`_#$%&*+\/|=@\\^~()<>[]{}0123456789AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz
            / !"#$%&'()*+,-.\/0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~
            /r; # don't smash the original $a and $b, instead assign to $z
        #printf "%-10s -> %-10s\n", $_, $z;
        $z;
    } ($a,$b); # transliterate $a and $b into $x and $y for sorting
    show_compare($a, $b, $x, $y);
    $x cmp $y
} @list;

say "\nSorted list";
say for @sorted_list;

sub show_compare {
    my ($a, $b, $x, $y) = @_;
    printf "%-10s %-10s %s %-10s %-10s\n", $a, $x, (qw/< = >/)[1+($x cmp $y)], $b, $y;
}
Output:
.,-;:!?"'`_#$%&*+/|=@\^~()<>[]{}0123456789AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz intended sequence
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~ natural ASCII sequence

[Show $a and $b and their transliterated equivalents]
"Hello" 5faoou5 < Abel X[ao
(hello) FgaoouG < {adieu} LY_ia M
@adieu BY_ia < [goodbye] Jeuu_[ aK
Charlie \gY{oia > ^Charlie D\gY{oia
~Adieu EX_ia > zebra a[{Y
21708 ('-&. < baker [Yma{
. ! < - #
; 1 > : 0
! 3 < ? 4
" 5 < ' 6
` 7 < _ 8
"Hello" 5faoou5 < (hello) FgaoouG
(hello) FgaoouG < Abel X[ao
Abel X[ao > {adieu} LY_ia M
"Hello" 5faoou5 < @adieu BY_ia
@adieu BY_ia < (hello) FgaoouG
(hello) FgaoouG < [goodbye] Jeuu_[ aK
[goodbye] Jeuu_[ aK < {adieu} LY_ia M
^Charlie D\gY{oia > zebra a[{Y
^Charlie D\gY{oia < ~Adieu EX_ia
Charlie \gY{oia > ~Adieu EX_ia
zebra a[{Y < 21708 ('-&.
21708 ('-&. < ^Charlie D\gY{oia
^Charlie D\gY{oia < baker [Yma{
baker [Yma{ > ~Adieu EX_ia
baker [Yma{ < Charlie \gY{oia
"Hello" 5faoou5 > zebra a[{Y
"Hello" 5faoou5 > 21708 ('-&.
"Hello" 5faoou5 < ^Charlie D\gY{oia
@adieu BY_ia < ^Charlie D\gY{oia
^Charlie D\gY{oia < (hello) FgaoouG
(hello) FgaoouG > ~Adieu EX_ia
(hello) FgaoouG < baker [Yma{
[goodbye] Jeuu_[ aK < baker [Yma{
baker [Yma{ > {adieu} LY_ia M
baker [Yma{ > Abel X[ao
. ! < : 0
: 0 > - #
. ! < ! 3
! 3 > - #
! 3 > : 0
! 3 > ; 1
" 5 < ` 7
` 7 > ' 6
. ! < " 5
" 5 > - #
" 5 > : 0
" 5 > ; 1
" 5 > ! 3
" 5 > ? 4
zebra a[{Y < . !
. ! < 21708 ('-&.
21708 ('-&. > - #
21708 ('-&. < : 0
"Hello" 5faoou5 > : 0
"Hello" 5faoou5 > ; 1
"Hello" 5faoou5 > ! 3
"Hello" 5faoou5 > ? 4
"Hello" 5faoou5 > " 5
"Hello" 5faoou5 < ' 6
@adieu BY_ia > ' 6
@adieu BY_ia > ` 7
@adieu BY_ia > _ 8

Sorted list
zebra
.
-
21708
:
;
!
?
"
"Hello"
'
`
_
@adieu
^Charlie
~Adieu
(hello)
[goodbye]
{adieu}
Abel
baker
Charlie
It appears that tr is not transliterating in the way I expected it to. Can someone spot the error? Is there a better way to do this?
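One likely culprit, judging from the output: inside tr a hyphen between two characters denotes a range, so `,-;` in the search list expands to every character from `,` through `;` (including the digits), and everything after it is shifted against the replacement list. Whitespace inside the tr lists is also taken literally. A minimal sketch of one way to sidestep tr escaping altogether is to build a per-character rank table from the intended sequence and sort on precomputed keys. The names `$order`, `%rank` and `sort_key` are hypothetical, not from the original script.

```perl
#!/usr/bin/env perl
use 5.010;
use strict;
use warnings;

# Intended collating sequence from the post: punctuation, digits, then AaBb...Zz.
# q{} does not interpolate; the only escape needed is \\ for a literal backslash.
my $order = q{.,-;:!?"'`_#$%&*+/|=@\\^~()<>[]{}0123456789}
          . join('', map { uc($_) . $_ } 'a' .. 'z');

# Map each character to a stand-in whose native ASCII order matches the
# intended order ('!' is chr 33, so the stand-ins run from chr 33 upward).
my %rank;
my @chars = split //, $order;
$rank{ $chars[$_] } = chr(33 + $_) for 0 .. $#chars;

# Hypothetical helper: build a sort key for one string.
# Characters not in $order (an assumption: e.g. spaces) pass through unchanged.
sub sort_key {
    my ($s) = @_;
    return join '', map { $rank{$_} // $_ } split //, $s;
}

my @list = qw{ "Hello" Abel (hello) {adieu} @adieu [goodbye] Charlie ^Charlie
               ~Adieu zebra 21708 baker . - ; : ! ? " ' ` _ };

# Compute each key once, then compare the keys with plain cmp.
my %key    = map { $_ => sort_key($_) } @list;
my @sorted = sort { $key{$a} cmp $key{$b} } @list;

say for @sorted;
```

Precomputing the keys also avoids re-transliterating the same strings on every comparison. If you would rather keep tr, escaping the hyphen as `\-` in the search list and keeping stray whitespace out of both lists should address the range problem.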

In reply to Changing ASCII collating sequence for sort by ibm1620
