in reply to Really Long if/elsif/else blocks

One suggestion I'd make (if you can't use routing tables) would be to change:
elsif($checkPrefix and # This is the main check for prefi +xed forms &is_prefixed_partspeechAnnotated($base)){ # wordCache{base} + gets filled in. unless($tok eq $base){ #$prefixLessFm Mostly, base WILL e +q tok, but catch the odd case. &equivWord($tok, $base, {atail => $atail})}; 1} # If $checkPrefix is false, then we've already stripped a prefix, # so it can't be an abbreviation, a number or an uncannical form. elsif($checkPrefix and $canon{$tok} and $canon{$tok} ne $tok){ &driveBase($canon{$tok}, {%$Args, checkPrefix=>0}); &equivWord($tok, $canon{$tok}, {})} elsif(($sing) = $tok =~ /(.+)S$/ and $canon{$sing} and $canon{$sin +g} ne $tok){ # So REQTS => REQT => REQUIREMENT &equivWord($tok, $canon{$sing}, {plural=>1})} elsif($checkPrefix and $abbrev{$tok}){ # Not normally invoked, +as abbrevs have already been expanded usually. &equivWord($tok, $abbrev{$tok}, {abbreviation =>1})} elsif($checkPrefix and (&part_number_p($base) or $atail eq 'PARTNO')){ # P/N and PN: <partno> get transformed in immediate_transforms +; see the improves file &baseInscript($tok, {group => 'PARTNO', root => $tok})} elsif($checkPrefix and &power_loc_p($tok)){ &baseInscript($tok, {group => 'POWERLOC', root => $tok})} elsif($checkPrefix and &is_loc_code($tok)){ &baseInscript($tok, {group => 'LOC_CODE', root => $tok})} elsif($checkPrefix and $savbase = &is_measure_physics_frac($tok)){ $num_item{$tok} = 1 unless defined($num_item{$tok}); &baseInscript($tok, {group => $savbase, num_item =>1, root => $tok})}
to
elsif( $checkPrefix ) { CheckPrefaces(); }

Where CheckPrefaces would be a new routine that contained:

if(is_prefixed_partspeechAnnotated($base)){ # wordCache{base} gets + filled in. unless($tok eq $base){ #$prefixLessFm Mostly, base WILL e +q tok, but catch the odd case. &equivWord($tok, $base, {atail => $atail})}; 1} # If $checkPrefix is false, then we've already stripped a prefix, # so it can't be an abbreviation, a number or an uncannical form. elsif($canon{$tok} and $canon{$tok} ne $tok){ &driveBase($canon{$tok}, {%$Args, checkPrefix=>0}); &equivWord($tok, $canon{$tok}, {})} elsif(($sing) = $tok =~ /(.+)S$/ and $canon{$sing} and $canon{$sin +g} ne $tok){ # So REQTS => REQT => REQUIREMENT &equivWord($tok, $canon{$sing}, {plural=>1})} elsif($abbrev{$tok}){ # Not normally invoked, as abbrevs have a +lready been expanded usually. &equivWord($tok, $abbrev{$tok}, {abbreviation =>1})} elsif((&part_number_p($base) or $atail eq 'PARTNO')){ # P/N and PN: <partno> get transformed in immediate_transforms +; see the improves file &baseInscript($tok, {group => 'PARTNO', root => $tok})} # etc., etc. etc.

That is, remove the constant test from the secondary conditions.

Replies are listed 'Best First'.
Re^2: Really Long if/elsif/else blocks
by throop (Chaplain) on May 30, 2008 at 20:12 UTC
    Well.. it would look cleaner. But I don't think it would fall through correctly. You see, if none of those
       elsif($checkPrefix and whatever)
    tests succeed, it still needs to fall through to the following tests, even if $checkPrefix were true.
      I tend to agree with apl on this point (and I also agree with the person who suggested changing your indent/line-break style). Apart from that, I was struck by the variety of ways in which you phrase similar conditions, and by the seemingly arbitrary ordering of conditions.

      If your "split_token()" sub is returning a 3-element list whose members vary significantly among all of: /"undef", "defined but empty string", "non-empty string but evaluates to 0", "non-empty, non-zero value"/, I think you should be making that more clear. When I see conditions like:

      if ( $atail and $ntail ne '' ) { ... elsif ( !$atail and $ntail ne '' ) { ... elsif ( not ($atail eq '' and $ntail eq '' )) {
      I can't help thinking that it should probably be:
      if ( $atail and $ntail ) { ... elsif ( $atail ) { ... elsif ( $ntail ) { ...
      If it needs to be different from that, you should be really clear about why it needs to be different.

      I was puzzled by the last elsif condition:

      elsif($checkPrefix and $unprefBase and # Set in an earlier test, above $savgrp = &group_of($unprefBase, 1) and # This time, try to +derive a group for the base. &member($metagroup{$savgrp}, 'NVRB', 'NOUN','VERB','ADJECTIV +E','ADVB', 'GERUND')){}
      When I looked for "$unprefBase" above that point, I found it only once, near the top of the sub:
      my($sing, $unprefBase, $savgrp, $savbase);
      Nothing is ever assigned to $unprefBase, so the following two parts of that last elsif condition will never be evaluated -- and the block itself is empty anyway, so why have that condition at all? (That's the first thing I'd get rid of, if the idea is to shorten the list of elsif blocks.) You might want to set up some test data, such that you believe it should exercise every block, and then see whether it really does get into every block.

      I think the issue is not so much to shorten the list of elsif conditions, but rather to organize them into something like a coherent order and structure, so that sequential dependencies, and the extra assignment steps that affect outcomes, are more easily (and logically) identifiable as such.

      Another point: at the top you say "driveBase returns true for success...", but then it contains recursive calls to driveBase where the return value is not checked. In fact, since "wordCache{$tok}" is the last statement in the sub, it will only return false if this hash element is 0, empty string, or undef. And what would that mean, exactly?

      Good catch. The new sub should be a function then that would return a 1 by default, but a 0 if none of the conditions were met (i.e. the final else would set it).

      Then the conditional in the original code would be

      elsif($checkPrefix and CheckPrefaces() ) { } # do *nothing*; processing was handled in CheckPrefaces

      It looks a trifle odd, but I still prefer factoring out a related block of code. Everything works fine in the original body of code, but someday it will have to be modified, and the supporting programmers eyes will glaze over...