madbombX has asked for the wisdom of the Perl Monks concerning the following question:

Monks,

With all this discussion of Dispatch Tables (apparently brought forward mostly by me), I have been attempting to implement them in places I think they would make my code more maintainable and more efficient. In more than a few places, I have successfully replaced chains of if/else statements with dispatch tables and a successful speedup (sometimes of over 300% on longer chains).

However, I have come to a situation that I am not sure weather or not to leave the current code in place (which is not very clean) or attempt to replace it. The speed of the code is somewhat important as the module gets called (loaded, used, and then unloaded) every time an email comes in (which happens more than just a few thousand times per day). Since these are all integer comparisons, they should all be fast anyway, but there might just be a more elegent way of doing this.

Currently the code is made up of chained conditionals (which happens to be slightly beyond the tipping point of 12 that Limbic~Region proved in When should I use a dispatch table?. However, I don't see (probably through lack of knowledge) how one would implement a dispatch table since 2 tests are required to obtain an outcome (or $rv) and the hash can only have one key.

# data characteristics # 0.0001 <= $conf < 1.0000 # 0.0001 <= $prob <= 1.0000 my $outcome = ( (($prob-0.5)*2) + ($conf*100)); # Current code if ($outcome > 97) { $rv .= "99"; } elsif ( ($outcome > 93) && ($outcome <= 97) ) { $rv .= "95"; } elsif ( ($outcome > 85) && ($outcome <= 93) ) { $rv .= "90"; } elsif ( ($outcome > 75) && ($outcome <= 85) ) { $rv .= "80"; } elsif ( ($outcome > 65) && ($outcome <= 75) ) { $rv .= "70"; } elsif ( ($outcome > 55) && ($outcome <= 65) ) { $rv .= "60"; } elsif ( ($outcome > 45) && ($outcome <= 55) ) { $rv .= "50"; } elsif ( ($outcome > 35) && ($outcome <= 45) ) { $rv .= "40"; } elsif ( ($outcome > 25) && ($outcome <= 35) ) { $rv .= "30"; } elsif ( ($outcome > 15) && ($outcome <= 25) ) { $rv .= "20"; } elsif ( ($outcome > 7) && ($outcome <= 15) ) { $rv .= "10"; } elsif ( ($outcome > 3) && ($outcome <= 7) ) { $rv .= "05"; } elsif ( ($outcome > 0) && ($outcome <= 3) ) { $rv .= "00"; } else { return "ERROR"; }

Currently, my only 2 thought processes are to stay with the current code or change it to use Switch (which I have yet to Benchmark for my purpose). Thanks in advance. Thoughts?

Replies are listed 'Best First'.
Re: Efficiency & Maintenance: if/elsif or Other Dispatch Method
by jdporter (Paladin) on Dec 01, 2006 at 18:14 UTC

    This has come up before; but I can tell you right off the bat your code is more complicated than it needs to be. In the following:

    if ($outcome > 97) { $rv .= "99"; } elsif ( ($outcome > 93) && ($outcome <= 97) )
    You don't need the second condition in the second case (<= 97) because it will always be true if that test is reached, due to the preceding test.
    if ( $outcome > 97 ) { $rv .= "99"; } elsif ( $outcome > 93 ) { $rv .= "95"; } elsif ( $outcome > 85 ) { $rv .= "90"; } elsif ( $outcome > 75 ) { $rv .= "80"; } elsif ( $outcome > 65 ) { $rv .= "70"; } elsif ( $outcome > 55 ) { $rv .= "60"; } elsif ( $outcome > 45 ) { $rv .= "50"; } elsif ( $outcome > 35 ) { $rv .= "40"; } elsif ( $outcome > 25 ) { $rv .= "30"; } elsif ( $outcome > 15 ) { $rv .= "20"; } elsif ( $outcome > 7 ) { $rv .= "10"; } elsif ( $outcome > 3 ) { $rv .= "05"; } elsif ( $outcome > 0 ) { $rv .= "00"; }

    We're building the house of the future together.
      A slightly shorter version of the above code.
      $rv .= $outcome > 97 ? 99 : $outcome > 93 ? 95 : $outcome > 85 ? 90 : $outcome > 75 ? 80 : $outcome > 65 ? 70 : $outcome > 55 ? 60 : $outcome > 45 ? 50 : $outcome > 35 ? 40 : $outcome > 25 ? 30 : $outcome > 15 ? 20 : $outcome > 7 ? 10 : $outcome > 3 ? '05' : $outcome > 0 ? '00' : q{};
Re: Efficiency & Maintenance: if/elsif or Other Dispatch Method
by davido (Cardinal) on Dec 01, 2006 at 18:09 UTC

    You could bring your limits and calculated rv's into an array of arrays, and then iterate through it to test outcomes. See below:

    use strict; use warnings; my @outcome_tests = ( [ 97, 100, '99' ], [ 93, 97, '95' ], [ 85, 93, '90' ], [ 75, 85, '80' ], [ 65, 75, '70' ], [ 55, 65, '60' ], [ 45, 55, '50' ], [ 35, 45, '40' ], [ 25, 35, '30' ], [ 15, 25, '20' ], [ 7, 15, '10' ], [ 3, 7, '05' ], [ 0, 3, '00' ], ); for( 1 .. 20 ) { # Run the test 20 times. my $outcome = int( rand( 100 ) + 1 ); # Generate some test data. my $rv = ''; TEST: { foreach my $test ( @outcome_tests ) { my( $lower, $upper, $category ) = @{ $test }; if( $outcome > $lower and $outcome <= $upper ) { $rv .= $category; last TEST; } } die "ERROR: $outcome is out of bounds.\n"; } print "$outcome => $rv\n"; }

    One advantage to this solution is that it turns your limits and categories into data, rather than code. You can add or remove limit pairs without altering the code, and without affecting complexity.


    Dave

Re: Efficiency & Maintenance: if/elsif or Other Dispatch Method
by Not_a_Number (Prior) on Dec 01, 2006 at 18:59 UTC

    A solution that might lack elegance but should be pretty fast is a simple hash look-up. Construct a hash with keys running from 1 .. 100 and values equivalent to the 'probability' for each key. Something like this:

    my $outcome = shift; my %h = map { $_ => 10 * int ( ( $_ + 4 ) /10 ) } 8 .. 93; @h{1 .. 3} = (0) x 3; @h{4 .. 7} = (5) x 5; @h{94 .. 97} = (95) x 4; @h{98 .. 100} = (99) x 3; print defined $h{$outcome} ? sprintf "%02d\n", $h{$outcome} : 'ERROR';

    Update: Just realised that your numbers don't have to be integers. If so, change the beginning of the above to:

    use POSIX; my $outcome = ceil( shift );
Re: Efficiency & Maintenance: if/elsif or Other Dispatch Method
by BrowserUk (Patriarch) on Dec 01, 2006 at 19:18 UTC

    You can do this much more efficiently, O(1), by using a lookup table.

    my @lookup = ( '00', map { ( $_->[2] ) x ( $_->[ 1 ] - $_->[ 0 ] ) } [ 0, 3, '00' ], [ 3, 7, '05' ], [ 7, 15, '10' ], [ 15, 25, '20' ], [ 25, 35, '30' ], [ 35, 45, '40' ], [ 45, 55, '50' ], [ 55, 65, '60' ], [ 65, 75, '70' ], [ 75, 85, '80' ], [ 85, 93, '90' ], [ 93, 97, '95' ], [ 97, 100, '99' ] ); print "@lookup"; $rv .= $lookup[ ((( $prob - 0.5 )*2 )+( $conf * 100 )) ];

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Efficiency & Maintenance: if/elsif or Other Dispatch Method
by Fletch (Bishop) on Dec 01, 2006 at 18:46 UTC

    And as is periodically mentioned, the conventional wisdom on using Switch in production is "you shouldn't" (straight from the horse's mouth, Categorized Damian Modules).

    Update: And as for suggestions, since you've got a predicate rather than a simple key you could still use a dispatch table, it's just not as direct. I haven't looked at your sample data directly, but often you can find some mathematical function which converts a value to a "bin" number (as a really trivial example, something like int( $value / 10 ) to map a value 0..100 into bins 0..10). You then use that function as the key in your dispatch table instead.

      Fletch,
      A similar module (Case) by Roy Johnson is not written using source filters or evil goto. It may not suit madbombX's purpose (performance) but for clarity it may be a nice alternative.

      Cheers - L~R

Re: Efficiency & Maintenance: if/elsif or Other Dispatch Method
by Limbic~Region (Chancellor) on Dec 01, 2006 at 19:27 UTC
    madbombX,
    It has already been pointed out that the number of conditions you need to account for are small enough that you can use a hash. Notice I didn't say dispatch table. This is because no code need be dispatched.
    my $rv = $lookup{$outcome};
    In other situations, it is resource prohibitive to store every possible condition ($outcome in your example) in the hash though the number of end states ($rv in your example) is quite manageable. In these situations I like to combine a smart search routine with a dispatch table:
    $dispatch->{bin_search($outcome)}->();
    In this particular case I have implied a binary search but it could be sequential as you have shown if that scenario fits your data. The point is you abstract finding the item in a list as well as what action to take upon the result.

    Cheers - L~R