kabeldag has asked for the wisdom of the Perl Monks concerning the following question:

I want to avoid millions of *if* tests in the following situation, but I'm having a bit of a *braino*.

I'm reading values from lines of data. Each line follows one of about 30 known formats, and each format puts different values in different positions within the line. Because I know what each value is and where it sits in the line, I can read the values in and assign each one to a descriptive hash key. However, I currently can't see past checking the format type with an if statement and then putting the values into their respective hash keys via a per-format regular expression match.

For example, here are the lines, and the matching formats respectively:

# Line1: icecream "toasted sandwich" lemonade 33
# Line2: icecream "toasted sandwich" 14
# Line3: "fish & chips" 12 cola icecream

# Format1: dessert meal drink cost
# Format2: dessert meal cost
# Format3: meal cost drink dessert

Here's a mock-up of how I am assigning the hash-key values:

use strict;
#use warnings;

my %dockets;
my $format = 'format1';

my $data = <<TEST_DATA;
icecream "toasted sandwich" lemonade 8
icecream "hamburger BLT" water 12
icecream "tuna slosh" whiskey 22
icecream "dog food" gatorade 43
TEST_DATA

# Each entry: [ format name, pattern string ]
my @known_formats = (
    [ 'format1', '(.+)\s\"(.+)\"\s(.+)\s(\d{1,2})' ],
    [ 'format2', '(.+)\s\"(.+)\"\s(\d{1,2})' ],
    [ 'format3', '\"(.+)\"\s(\d{1,2})\s(.+)\s(.+)' ],
);

sub getfields {
    my @data_lines = split /\n/, shift;
    my ($meal, $drink, $dessert, $cost);
    my $docket_cnt = 0;

    for my $line (@data_lines) {
        if ($format eq 'format1') {
            my $num_fields = (
                $dockets{$docket_cnt}{'dessert'},
                $dockets{$docket_cnt}{'meal'},
                $dockets{$docket_cnt}{'drink'},
                $dockets{$docket_cnt}{'cost'}
            ) = $line =~ /$known_formats[0][1]/;
            $docket_cnt++;
        }
        elsif ($format eq 'format2') {
            my $num_fields = (
                $dockets{$docket_cnt}{'dessert'},
                $dockets{$docket_cnt}{'meal'},
                $dockets{$docket_cnt}{'cost'}
            ) = $line =~ /$known_formats[1][1]/;
            $docket_cnt++;
        }
        elsif ($format eq 'format3') {
            my $num_fields = (
                $dockets{$docket_cnt}{'meal'},
                $dockets{$docket_cnt}{'cost'},
                $dockets{$docket_cnt}{'drink'},
                $dockets{$docket_cnt}{'dessert'}
            ) = $line =~ /$known_formats[2][1]/;
            $docket_cnt++;
        }
    }
}

getfields($data);

# Sort numerically on the docket number itself (the hash key),
# not on the hash reference stored under it.
for my $docket_num (sort { $a <=> $b } keys %dockets) {
    print "=============================\n";
    print "[$docket_num] Meal: ", $dockets{$docket_num}{'meal'}, "\n";
    print "[$docket_num] Drink: ", $dockets{$docket_num}{'drink'}, "\n";
    print "[$docket_num] Dessert: ", $dockets{$docket_num}{'dessert'}, "\n";
    print "[$docket_num] Cost: ", $dockets{$docket_num}{'cost'}, "\n";
}

What I've done works well enough, it's just that I don't want to have 3 thousand if statements.
..suffering from mental-block... Hope I'm making sense.

Replies are listed 'Best First'.
Re: Avoiding multiple *if* statements
by Limbic~Region (Chancellor) on Jul 05, 2008 at 13:44 UTC
    kabeldag,
    In addition to Anonymous Monk's advice, see Implementing Dispatch Tables for the general case. If you have a situation where a different block of code needs to be run for each of your 30 formats, and you can determine the format relatively easily, dispatch tables are the way to go.
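For readers who haven't met the idea, here is a minimal sketch of a dispatch table: a hash that maps each format name to a code reference, so choosing the handler is a single hash lookup instead of an if/elsif chain. The handler bodies, the anchored patterns, and the `parse_line` name are assumptions made up for this illustration; they are not taken from the node mentioned above.

```perl
use strict;
use warnings;

# Hypothetical handlers, one per format. Each takes a line and returns
# a hash ref of named fields, or nothing if the line does not match.
my %dispatch = (
    format1 => sub {
        my @f = $_[0] =~ /^(\S+)\s"([^"]+)"\s(\S+)\s(\d+)$/ or return;
        return { dessert => $f[0], meal => $f[1], drink => $f[2], cost => $f[3] };
    },
    format3 => sub {
        my @f = $_[0] =~ /^"([^"]+)"\s(\d+)\s(\S+)\s(\S+)$/ or return;
        return { meal => $f[0], cost => $f[1], drink => $f[2], dessert => $f[3] };
    },
);

# Look the handler up by format name and call it.
sub parse_line {
    my ($format, $line) = @_;
    my $handler = $dispatch{$format}
        or die "Unknown format '$format'\n";
    return $handler->($line);
}

my $rec = parse_line('format3', '"fish & chips" 12 cola icecream');
print "$rec->{meal} costs $rec->{cost}\n";
```

Adding a 31st format then means adding one more entry to the hash; `parse_line` itself does not change.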

    Cheers - L~R

Re: Avoiding multiple *if* statements
by Anonymous Monk on Jul 05, 2008 at 12:40 UTC
    like this
    my %known_formats = (
        'format1' => [
            [qw[ dessert meal drink cost ]],
            '(.+)\s\"(.+)\"\s(.+)\s(\d{1,2})',
        ],
    );

    for my $format ( keys %known_formats ) {
        my $re     = $known_formats{$format}[1];
        my $fields = $known_formats{$format}[0];
        # Hash slice: the captures are assigned to the field names in order.
        @{ $dockets{$docket_cnt} }{@$fields} = $line =~ /$re/;
    }
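Since the snippet above leans on `$line`, `$docket_cnt` and `%dockets` from the original post, a self-contained sketch of the same table-driven idea may help. It assumes the format of the input is already known, so it looks that one format up directly rather than looping over all of them. The `parse_lines` name, the array of dockets, and the `\A`/`\z` anchors are choices made for this example only.

```perl
use strict;
use warnings;

# Each format maps to [ field names in capture order, pattern string ].
my %known_formats = (
    format1 => [ [qw(dessert meal drink cost)], '(.+)\s"(.+)"\s(.+)\s(\d{1,2})' ],
    format2 => [ [qw(dessert meal cost)],       '(.+)\s"(.+)"\s(\d{1,2})' ],
    format3 => [ [qw(meal cost drink dessert)], '"(.+)"\s(\d{1,2})\s(.+)\s(.+)' ],
);

sub parse_lines {
    my ($format, @lines) = @_;
    my ($fields, $re) =
        @{ $known_formats{$format} || die "Unknown format '$format'\n" };

    my @dockets;
    for my $line (@lines) {
        my %docket;
        # Hash slice: captures go to the field names in order.
        # Lines that do not match the pattern are skipped.
        ( @docket{@$fields} = $line =~ /\A$re\z/ ) or next;
        push @dockets, \%docket;
    }
    return @dockets;
}

my @dockets = parse_lines( 'format2', 'icecream "toasted sandwich" 14' );
print "$dockets[0]{meal}: $dockets[0]{cost}\n";
```

The per-format knowledge lives entirely in the table, so the parsing loop stays the same length no matter how many formats are added.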
      I was thinking of doing something like that with the line fields/values, but within the original array.
      Anyhow, thanks very much for that! Much appreciated :-)
Re: Avoiding multiple *if* statements
by johngg (Canon) on Jul 05, 2008 at 19:59 UTC
    Not solving your problem but just a point of interest: in your regex, e.g. '(.+)\s\"(.+)\"\s(\d{1,2})', you are escaping the double quotes. That isn't necessary, as they are not regular expression metacharacters and you aren't using them in a double-quoted string. You might want to consider storing pre-compiled patterns rather than pattern strings by using qr.
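As a small sketch of both points, the patterns below are stored as compiled `qr//` objects and the double quotes are left unescaped. The `%pattern_for` hash and the sample line are made up for illustration.

```perl
use strict;
use warnings;

# Patterns compiled once with qr//; the double quotes need no backslash.
my %pattern_for = (
    format1 => qr/(.+)\s"(.+)"\s(.+)\s(\d{1,2})/,
    format2 => qr/(.+)\s"(.+)"\s(\d{1,2})/,
);

my $line   = 'icecream "toasted sandwich" lemonade 33';
my @fields = $line =~ $pattern_for{format1};

print join( ' | ', @fields ), "\n" if @fields;
```

A `qr//` object can be used directly on the right-hand side of `=~`, as here, or interpolated into a larger pattern.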

    I hope this is useful.

    Cheers,

    JohnGG

      True. I guess I'm just so used to escaping them in an interpolated string. I'm not using the qr operator in the interest of compatibility -- that was a conscious decision.
        Just as a matter of curiosity: compatibility with what? The qr{ } operator has been around for a long time now.