Pimp my code - Schwartzian transform maybe?

GrandFather has asked for the wisdom of the Perl Monks concerning the following question:

The code below seems a bit clunky to me. It is generating the correct result. It's unlikely to be used with a huge list of tagIndexes, but the list could easily be a few hundred entrys long.

Part of cleaning it up is likely to involve a Schwartzian transform, but the flagValue sub seems clunky to me too. Any ideas for cleaning it up?

#!/usr/bin/perl

use warnings;
use strict;

my %tagTypes =
    (
    code => ['code', 'code', ''],
    bold => ['b', 'Bold', 'F'],
    teletype => ['tt', 'Teletype text', 'F'],
    id => ['link id://', 'Node id link', 'L'],
    readmore => ['readmore', 'Readmore', 'RB'],
    readmoretitle => ['readmoretitle', 'Readmore Title', 'RTB'],
    );
    
my @tagIndexes =
    (
    ['on', 'code', '1.0'],
    ['off', 'code', '3.6'],
    ['on', 'id', '3.9'],
    ['on', 'bold', '3.9'],
    ['off', 'id', '3.20'],
    ['off', 'bold', '3.20'],
    );
    
print join "\n", map {"$_->[0] $_->[1] $_->[2]"} sort modeOrder @tagIn
+dexes;

sub modeOrder {
    # Sort compare routine to order tags by markup generation order
    my ($aType, $aItem, $aIndex) = @$a;
    my ($bType, $bItem, $bIndex) = @$b;

    return $_ if ($_ = int ($aIndex) <=> int ($bIndex));
    
    $aIndex =~ s/\d*\.//;
    $bIndex =~ s/\d*\.//;
    
    return $_ if $_ = $aIndex <=> $bIndex;
    return $_ if $_ = $aType cmp $bType;

    my $aMode = ${$tagTypes{$aItem}}[2];
    my $bMode = ${$tagTypes{$bItem}}[2];
    
    return flagValue ($aMode) <=> flagValue ($bMode) if $aType eq 'on'
+;
    return flagValue ($bMode) <=> flagValue ($aMode); # off case
}

sub flagValue {
    my $flags = shift;
    
    return 1 if -1 != index $flags, 'R';
    return 2 if -1 != index $flags, 'T';
    return 3 if -1 != index $flags, 'B';
    return 4 if -1 != index $flags, 'I';
    return 5 if -1 != index $flags, 'F';
    return 6 if -1 != index $flags, 'L';
    return 7;
}
[download]

Prints:

on code 1.0
off code 3.6
on bold 3.9
on id 3.9
off id 3.20
off bold 3.20
[download]

Update: So far sub flagValue hasn't been addressed. The intent of the routine is to priority encode a sort value given a string that may contain multiple flags. For example a flag string may be 'BFXCU' which should encode as 3 using the original code.

Note that the actual values used are unimportant except that it would be good if the sort order were the same regardless of using <=> or cmp. The sort order must remain the same as the original code.

DWIM is Perl's answer to Gödel

Comment on Pimp my code - Schwartzian transform maybe? Select or Download Code

Replies are listed 'Best First'.
Re: Pimp my code - Schwartzian transform maybe? by bart (Canon) on May 06, 2006 at 11:52 UTC
Yes you can use the Schwartzian Transform. What you must do then, is preprocess the data so you no longer have to do those things in the comparison sub. I'm talking about stuff like `int ($aIndex)`, `$aIndex =~ s/\d*\.//`, `$aMode = ${$tagTypes{$aItem}}[2]` and `flagValue ($aMode)`. Each could be just a precaluted item in an anonymous array. Preprocessing these expressions can be done with code like this: `map { my ($type, $item, $index) = @$_; my($fract) = $index =~ /\.(\d+)/; my $mode = ${$tagTypes{$item}}[2]; my $flagValue = flagValue($mode); [ $_, # original, 0 $type, # 1 int($index), # 2 $fract, # 3 $mode, # 4 $flagValue, # 5 ]; } @tagIndexes` [download] The complete Schwartzian Transformed code then looks like this: `print join "\n", map {"$_->[0] $_->[1] $_->[2]"} map $_->[0], sort { $a->[2] <=> $b->[2] # int \|\| $a->[3] <=> $b->[3] # fract \|\| $a->[1] cmp $b->[1] # type \|\| ( $a->[1] eq 'on' ? $a->[5] <=> $b->[5] # flagValue : $b->[5] <=> $a->[5] ); # off case } map { my ($type, $item, $index) = @$_; my($fract) = $index =~ /\.(\d+)/; my $mode = ${$tagTypes{$item}}[2]; my $flagValue = flagValue($mode); [ $_, $type, int($index), $fract, $mode, $flagValue ]; } @tagIndexes;` [download] It appears to produce the same outcome for me.	[reply] [d/l] [select]
Re: Pimp my code - Schwartzian transform maybe? by salva (Canon) on May 06, 2006 at 13:36 UTC
it gets much simpler (and faster) using Sort::Key: `use Sort::Key::Multi qw(i3_keysort); # i3_keysort => sorter sub that expects # three integer keys print join "\n", map "$_->[0] $_->[1] $_->[2]", i3_keysort { my ($type, $item, $index) = @$_; my ($ix1, $ix2) = $index =~ /^(\d+)\.(\d+)$/; my $flag = flagValue($tagTypes{$item}[2]); $flag = -$flag if $type eq 'off'; ($ix1, $ix2, $flag) } @tagIndexes;` [download] note that the type and flag properties can be easily collapsed on an unique index.	[reply] [d/l]
Re^2: Pimp my code - Schwartzian transform maybe? by salva (Canon) on May 07, 2006 at 13:14 UTC
and moving things around: `use Sort::Key::Multi 'i3_keysort'; { my %sign = (on => 1, off => -1); sub flagsValue { my ($type, $item) = @_; my $flags = $tagTypes{$item}[2]; my $i = 1; for (qw(R T I F B L)) { last if index $flags, $_ >= 0 $i++ } $sign{$type} * $i; } } print join "\n", map "$_->[0] $_->[1] $_->[2]", i3_keysort { $_->[2] =~ /^(\d+)\.(\d+)$/, flagsValue($_->[0], $_->[1]) } @tagIndexes;` [download]	[reply] [d/l]
Re: Pimp my code - Schwartzian transform maybe? by ikegami (Patriarch) on May 06, 2006 at 20:13 UTC
Even better? `{ my @restore; print "$_->[0] $_->[1] $_->[2]\n" foreach map { $restore[unpack('N', substr($_, -4))] } sort map { my ($type, $item, $index) = @$_; push(@restore, $_); my ($int, $fract) = split(/\./, $index); my $flagValue = flagValue($tagTypes{$item}[2]); $flagValue = -$flagValue if $type eq 'off'; pack('NNCN', int($index), $index =~ /\.(\d+)/, $flagValue & 0xFF ^ 0x80, $#restore, ) } @tagIndexes; }` [download] Points of interest: Negating `$flagValue` when `$type eq 'off'` removes the need for `$aType cmp $bType` and for `if $aType eq 'on'` The above uses the default compare function (`$a cmp $b`) for ~~the win!~~ a big speed boost. `${$tagTypes{$aItem}}[2]`??? `$tagTypes{$aItem}[2]`!!! Bonus: It's coincidently a stable sort. Update: Upgraded the sort key generation code from the following presumably slower code: `join '', pack('N', int($index)), pack('N', $index =~ /\.(\d+)/), sprintf('%02d', $flagValue), pack('N', $#restore)` [download]	[reply] [d/l] [select]
Re: Pimp my code - Schwartzian transform maybe? by GrandFather (Saint) on May 07, 2006 at 11:38 UTC
Heh, here's one way to do it that avoids the string of if statements: `sub flagValue { my ($flags, $item) = @_; my $flagOrder = ' RTBIFL'; (-1 != index $flags, substr $flagOrder, $_, 1) and return $_ for 1..length($flagOrder); }` [download] DWIM is Perl's answer to Gödel	[reply] [d/l]
Re: Pimp my code - Schwartzian transform maybe? by wfsp (Abbot) on May 07, 2006 at 12:21 UTC
Here's my go at the `flagValue` sub. It uses a lookup but also inverts the logic so that an 'R' will return a high number (yours returned 1). Any `$flags` string that contains an 'R' will be higher than any combination without an 'R' and so on for the others. You would need to reflect the change in your sort routine. I would consider processing these values beforehand maybe adding them to `@tagIndexes`. `#!/usr/bin/perl use strict; use warnings; use Data::Dumper; my %flagvalues = ( R => 26, T => 25, B => 24, I => 23, F => 22, L => 1, ); my $flags = 'RRTBLXUC'; print flagValue($flags); sub flagValue{ $flags = shift; my $weight; my %unique_flags = map { $_ => undef } split //, $flags; for my $flag (keys %unique_flags){ $weight += exists $flagvalues{$flag} ? $flagvalues{$flag} : 0; # print "$flag -> $weight\n"; } return $weight; }` [download] Update: It also relies on any flag appearing only once. Update 2:** Updated to only weigh a flag once.	[reply] [d/l] [select]
Re^2: Pimp my code - Schwartzian transform maybe? by GrandFather (Saint) on May 08, 2006 at 12:05 UTC
Interesting. A little tweaking and we come up with the following which preserves the ordering and priority encoding nature of the original function. Note in particular that the addition is replaced by a bitwise or - that's where the priority encoding bit happens (it is also unaffected by duplicate flags). The bitwise complements preserve the original sort order. `my %flagvalues = ( R => ~(21 - 1), T => ~(22 - 1), B => ~(23 - 1), I => ~(24 - 1), F => ~(25 - 1), L => ~(26 - 1), ); sub flagValue { my $weight = ~(2**7 - 1); for (split //, shift){ $weight \|= $flagvalues{$_} if exists $flagvalues{$_}; } return ~$weight; }` [download] DWIM is Perl's answer to Gödel	[reply] [d/l]