in reply to Golf: Arbitrary Alphabetical Sorting

Well, here's a cut at it, even though it's not golfed yet:
sub o { my ($a, $w) = @_; my %m; @m{@$a} = (a..z); my $k = join "", keys %m; my $v = join "", values %m; map { eval "tr/$v/$k/"; $_ } sort map { eval "tr/$k/$v/"; $_ } @$w; }

-- Randal L. Schwartz, Perl hacker


update: and of course, it's doggy slow. That eval needs to be saved, like in a coderef or something.
update2: Bleh. RIght. 'a'..'z' was being presumptive that "letter" meant "letter" like I knew it. Here's the patch to make it work for larger sets, for reasonable values of work:
sub o { my ($a, $w) = @_; my %m; @m{@$a} = 1..@$a; my $k = join "", map {sprintf "\\%03o", ord $_} keys %m; my $v = join "", map {sprintf "\\%03o", $_ } values %m; map { eval "tr/$v/$k/"; $_ } sort map { eval "tr/$k/$v/"; $_ } @$w; }

Replies are listed 'Best First'.
Re: Re: Golf: Arbitrary Alphabetical Sorting
by Masem (Monsignor) on May 09, 2001 at 20:05 UTC
    While your $v/$k order is right, the one gotcha that I put in there was that the alphabet size was arbitrary, and not necessarily 26 characters.. it could be as many as 100, 1000, or more (Well, anything in non-Unicode above 255 makes no sense). So this trick doesn't work here.
    Dr. Michael K. Neylon - mneylon-pm@masemware.com || "You've left the lens cap of your mind on again, Pinky" - The Brain
      Well, if it's a single "character", tr likes it. What's the problem? About the only thing that's messy is the use of a slash and dash, and I could probably hork that somehow by always mapping to an octal-ish escape.

      I've reread the question three times now, and I don't see how you are defining "character" in any way other than something that tr can wrangle. If so, what's the structure of a "word" then? It's no longer a string, which would be a sequence of "characters" that tr can handle!

      -- Randal L. Schwartz, Perl hacker

        In your code, you prep your translation hash %m with keys from the input alphabet, and values of a through z. What if the input alphabet was [a..z1], or [a..z0-9]? These are more than 26 characters and thus, the 27th and higher characters will have null values for their keys, and will mess up your tr-and-back transformation.
        Dr. Michael K. Neylon - mneylon-pm@masemware.com || "You've left the lens cap of your mind on again, Pinky" - The Brain
        But what if the alphabet has 100 characters? Then, %m will have a through z, undef, undef, undef... as values, unless I'm missing something, which I'm pretty sure I'm not. That will lead to incorrect handling of the characters "above" 26.