GrandFather has asked for the wisdom of the Perl Monks concerning the following question:

The code below seems a bit clunky to me. It is generating the correct result. It's unlikely to be used with a huge list of tagIndexes, but the list could easily be a few hundred entrys long.

Part of cleaning it up is likely to involve a Schwartzian transform, but the flagValue sub seems clunky to me too. Any ideas for cleaning it up?

#!/usr/bin/perl use warnings; use strict; my %tagTypes = ( code => ['code', 'code', ''], bold => ['b', 'Bold', 'F'], teletype => ['tt', 'Teletype text', 'F'], id => ['link id://', 'Node id link', 'L'], readmore => ['readmore', 'Readmore', 'RB'], readmoretitle => ['readmoretitle', 'Readmore Title', 'RTB'], ); my @tagIndexes = ( ['on', 'code', '1.0'], ['off', 'code', '3.6'], ['on', 'id', '3.9'], ['on', 'bold', '3.9'], ['off', 'id', '3.20'], ['off', 'bold', '3.20'], ); print join "\n", map {"$_->[0] $_->[1] $_->[2]"} sort modeOrder @tagIn +dexes; sub modeOrder { # Sort compare routine to order tags by markup generation order my ($aType, $aItem, $aIndex) = @$a; my ($bType, $bItem, $bIndex) = @$b; return $_ if ($_ = int ($aIndex) <=> int ($bIndex)); $aIndex =~ s/\d*\.//; $bIndex =~ s/\d*\.//; return $_ if $_ = $aIndex <=> $bIndex; return $_ if $_ = $aType cmp $bType; my $aMode = ${$tagTypes{$aItem}}[2]; my $bMode = ${$tagTypes{$bItem}}[2]; return flagValue ($aMode) <=> flagValue ($bMode) if $aType eq 'on' +; return flagValue ($bMode) <=> flagValue ($aMode); # off case } sub flagValue { my $flags = shift; return 1 if -1 != index $flags, 'R'; return 2 if -1 != index $flags, 'T'; return 3 if -1 != index $flags, 'B'; return 4 if -1 != index $flags, 'I'; return 5 if -1 != index $flags, 'F'; return 6 if -1 != index $flags, 'L'; return 7; }

Prints:

on code 1.0 off code 3.6 on bold 3.9 on id 3.9 off id 3.20 off bold 3.20

Update: So far sub flagValue hasn't been addressed. The intent of the routine is to priority encode a sort value given a string that may contain multiple flags. For example a flag string may be 'BFXCU' which should encode as 3 using the original code.

Note that the actual values used are unimportant except that it would be good if the sort order were the same regardless of using <=> or cmp. The sort order must remain the same as the original code.


DWIM is Perl's answer to Gödel

Replies are listed 'Best First'.
Re: Pimp my code - Schwartzian transform maybe?
by bart (Canon) on May 06, 2006 at 11:52 UTC
    Yes you can use the Schwartzian Transform. What you must do then, is preprocess the data so you no longer have to do those things in the comparison sub. I'm talking about stuff like int ($aIndex), $aIndex =~ s/\d*\.//, $aMode = ${$tagTypes{$aItem}}[2] and flagValue ($aMode). Each could be just a precaluted item in an anonymous array.

    Preprocessing these expressions can be done with code like this:

    map { my ($type, $item, $index) = @$_; my($fract) = $index =~ /\.(\d+)/; my $mode = ${$tagTypes{$item}}[2]; my $flagValue = flagValue($mode); [ $_, # original, 0 $type, # 1 int($index), # 2 $fract, # 3 $mode, # 4 $flagValue, # 5 ]; } @tagIndexes
    The complete Schwartzian Transformed code then looks like this:
    print join "\n", map {"$_->[0] $_->[1] $_->[2]"} map $_->[0], sort { $a->[2] <=> $b->[2] # int || $a->[3] <=> $b->[3] # fract || $a->[1] cmp $b->[1] # type || ( $a->[1] eq 'on' ? $a->[5] <=> $b->[5] # flagValue : $b->[5] <=> $a->[5] ); # off case } map { my ($type, $item, $index) = @$_; my($fract) = $index =~ /\.(\d+)/; my $mode = ${$tagTypes{$item}}[2]; my $flagValue = flagValue($mode); [ $_, $type, int($index), $fract, $mode, $flagValue ]; } @tagIndexes;

    It appears to produce the same outcome for me.

Re: Pimp my code - Schwartzian transform maybe?
by salva (Canon) on May 06, 2006 at 13:36 UTC
    it gets much simpler (and faster) using Sort::Key:
    use Sort::Key::Multi qw(i3_keysort); # i3_keysort => sorter sub that expects # three integer keys print join "\n", map "$_->[0] $_->[1] $_->[2]", i3_keysort { my ($type, $item, $index) = @$_; my ($ix1, $ix2) = $index =~ /^(\d+)\.(\d+)$/; my $flag = flagValue($tagTypes{$item}[2]); $flag = -$flag if $type eq 'off'; ($ix1, $ix2, $flag) } @tagIndexes;

    note that the type and flag properties can be easily collapsed on an unique index.

      and moving things around:
      use Sort::Key::Multi 'i3_keysort'; { my %sign = (on => 1, off => -1); sub flagsValue { my ($type, $item) = @_; my $flags = $tagTypes{$item}[2]; my $i = 1; for (qw(R T I F B L)) { last if index $flags, $_ >= 0 $i++ } $sign{$type} * $i; } } print join "\n", map "$_->[0] $_->[1] $_->[2]", i3_keysort { $_->[2] =~ /^(\d+)\.(\d+)$/, flagsValue($_->[0], $_->[1]) } @tagIndexes;
Re: Pimp my code - Schwartzian transform maybe?
by ikegami (Patriarch) on May 06, 2006 at 20:13 UTC

    Even better?

    { my @restore; print "$_->[0] $_->[1] $_->[2]\n" foreach map { $restore[unpack('N', substr($_, -4))] } sort map { my ($type, $item, $index) = @$_; push(@restore, $_); my ($int, $fract) = split(/\./, $index); my $flagValue = flagValue($tagTypes{$item}[2]); $flagValue = -$flagValue if $type eq 'off'; pack('NNCN', int($index), $index =~ /\.(\d+)/, $flagValue & 0xFF ^ 0x80, $#restore, ) } @tagIndexes; }

    Points of interest:

    • Negating $flagValue when $type eq 'off' removes the need for
      $aType cmp $bType
      and for
      if $aType eq 'on'

    • The above uses the default compare function ($a cmp $b) for the win! a big speed boost.

    • ${$tagTypes{$aItem}}[2]???
      $tagTypes{$aItem}[2]!!!

    • Bonus: It's coincidently a stable sort.

    Update: Upgraded the sort key generation code from the following presumably slower code:

    join '', pack('N', int($index)), pack('N', $index =~ /\.(\d+)/), sprintf('%02d', $flagValue), pack('N', $#restore)
Re: Pimp my code - Schwartzian transform maybe?
by GrandFather (Saint) on May 07, 2006 at 11:38 UTC

    Heh, here's one way to do it that avoids the string of if statements:

    sub flagValue { my ($flags, $item) = @_; my $flagOrder = ' RTBIFL'; (-1 != index $flags, substr $flagOrder, $_, 1) and return $_ for 1..length($flagOrder); }

    DWIM is Perl's answer to Gödel
Re: Pimp my code - Schwartzian transform maybe?
by wfsp (Abbot) on May 07, 2006 at 12:21 UTC
    Here's my go at the flagValue sub.

    It uses a lookup but also inverts the logic so that an 'R' will return a high number (yours returned 1). Any $flags string that contains an 'R' will be higher than any combination without an 'R' and so on for the others.

    You would need to reflect the change in your sort routine.

    I would consider processing these values beforehand maybe adding them to @tagIndexes.

    #!/usr/bin/perl use strict; use warnings; use Data::Dumper; my %flagvalues = ( R => 2**6, T => 2**5, B => 2**4, I => 2**3, F => 2**2, L => 1, ); my $flags = 'RRTBLXUC'; print flagValue($flags); sub flagValue{ $flags = shift; my $weight; my %unique_flags = map { $_ => undef } split //, $flags; for my $flag (keys %unique_flags){ $weight += exists $flagvalues{$flag} ? $flagvalues{$flag} : 0; # print "$flag -> $weight\n"; } return $weight; }
    Update:

    It also relies on any flag appearing only once.

    Update 2:

    Updated to only weigh a flag once.

      Interesting. A little tweaking and we come up with the following which preserves the ordering and priority encoding nature of the original function. Note in particular that the addition is replaced by a bitwise or - that's where the priority encoding bit happens (it is also unaffected by duplicate flags). The bitwise complements preserve the original sort order.

      my %flagvalues = ( R => ~(2**1 - 1), T => ~(2**2 - 1), B => ~(2**3 - 1), I => ~(2**4 - 1), F => ~(2**5 - 1), L => ~(2**6 - 1), ); sub flagValue { my $weight = ~(2**7 - 1); for (split //, shift){ $weight |= $flagvalues{$_} if exists $flagvalues{$_}; } return ~$weight; }

      DWIM is Perl's answer to Gödel