archeman2 has asked for the wisdom of the Perl Monks concerning the following question:

I wrote a broken solution to a problem, and I have now discovered that the problem is more complex than I originally supposed. I have a set of fixed strings, each with a numeric value, in a file in the form:

  * string1,34
  * string2,10
  * string3,52
  * string4,104
  * string5,7

I need to find strings that match a hard-coded array of strings that usually looks like this:

string2:string4:totalName

Here are my Rules:

  1. IF one or more of the strings of the Array (not including the last element) are found, then a Total row must be appended to the file. The new row has the form name,# where 'name' matches the last element of the Array and # is the total of the values of the matched strings
  2. IF only 1 match is found, the totalName,# row must still be created, using the matched value
  3. IF no matches are found, the totalName row will NOT be created
  4. There may be more than 2 string elements in the array

So given a file with values that match the bullet list above, if the Array looks like this:

string2:string4:totalName

Then the line "totalName,114" would get added to the end of the file.

If the Array looks like this:

string1:string4:string2:otherName

Then the line "otherName,148" would get added to the end of the file.

If the Array looks like this:

string7:string9:noName

Then no line at all would get added to the end of the file. My original code was buggy because if either one of the two elements was missing, it didn't create the Total row in the file. Also the requirements grew when I found I needed to be able to combine more than two rows of the file.

    open( FILE, '<', $sumOutPath ) or warn "!!! Cannot Open $sumOutPath !!!\n";
    @LINES = ();
    @LINES = <FILE>;
    my @LINES2 = @LINES;
    close FILE;

    # For each row we may need to
    foreach my $LINE (@LINES) {
        chomp $LINE;
        if (index($LINE, $rpcComboStatsSegments[0]) != -1) {
            ### This row needs to be fixed to by a loop for all Stats if the first or 2nd isn't found.
            if($debug) { print "--->>> $LINE contains $rpcComboStatsSegments[0]\n"; }
            $combinedTotal = (split /,/,$LINE)[1];
            if($debug) { print "Combined Total = $combinedTotal\n"; }
            foreach my $LINE2 (@LINES2) {
                chomp $LINE2;
                if($debug) { print "---->>>> full line $LINE2 and $rpcComboStatsSegments[1]\n"; }
                if (index($LINE2, $rpcComboStatsSegments[1]) != -1) {
                    if($debug) { print "---->>>> $LINE2 contains $rpcComboStatsSegments[1]\n"; }
                    $combinedTotal = $combinedTotal+(split /,/,$LINE2)[1];
                    #print "Combined Total now = $combinedTotal\n";
                    #we want to Append a Total RPC statistic to the end of the SummaryOut file
                    open( FILE, '>>', $sumOutPath ) or warn "!!! Cannot Open $sumOutPath !!!\n";
                    print FILE $rpcComboStatsSegments[2] . "," . $combinedTotal . "\n";
                    close FILE;
                    last;
                }
            }
        }
    }

Just wondering if there is a preferred logical approach to matching 'lists' to other 'lists'? It feels like I may be doing this the hard way.

Replies are listed 'Best First'.
Re: flexible string value matching in lists
by BrowserUk (Patriarch) on Feb 16, 2017 at 01:40 UTC
    Just wondering if there is a preferred logical approach to matching 'lists' to other 'lists'?

    Yes. Put one of the lists in a hash.

    In your case, if you make a hash of your fixed strings:

    my %lookup = ( string1=>34, string2=>10, string3=>52, string4=>104, string5=>7 );
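
    In practice the hash would probably be filled from the file rather than typed in by hand. The following is only a sketch of one way to do that, assuming each line has the form name,value (inferred from the split /,/ in the original code and from the "totalName,114" output row). The helper name load_values and the in-memory demo data are hypothetical and not part of the original script.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: build a name => value hash from "name,value" lines.
    sub load_values {
        my ($fh) = @_;
        my %lookup;
        while ( my $line = <$fh> ) {
            chomp $line;
            next unless length $line;                  # skip blank lines
            my ( $name, $value ) = split /,/, $line, 2;
            next unless defined $value;                # skip lines without a comma
            $lookup{$name} = $value;
        }
        return %lookup;
    }

    # Demo on an in-memory filehandle; a real script would open $sumOutPath instead.
    my $data = "string1,34\nstring2,10\nstring3,52\nstring4,104\nstring5,7\n";
    open( my $fh, '<', \$data ) or die "Cannot open in-memory file: $!";
    my %lookup = load_values($fh);
    close $fh;

    print "$_ => $lookup{$_}\n" for sort keys %lookup;
    ```

    If a name appears on more than one line, this sketch keeps the last value seen; whether that is the right behaviour depends on the real file.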

    Then your code becomes a simple process of splitting your complex strings and doing lookups:

    my @bits = split ':', 'string2:string4:totalName';
    my $name = pop @bits;
    my $total = 0;
    for my $bit ( @bits ) {
        $total += $lookup{ $bit };
    }
    print OUTFILE "$name,$total";
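
    The rules in the question also say that a missing string should be skipped and that no row should be written when nothing matches. A minimal sketch of how the lookup could cover those cases with exists is shown here. The helper name combo_row and the 'string2:string9:partName' spec are hypothetical additions for illustration; the hash values are the ones from %lookup above.

    ```perl
    use strict;
    use warnings;

    my %lookup = ( string1 => 34, string2 => 10, string3 => 52, string4 => 104, string5 => 7 );

    # Hypothetical helper: returns "name,total" when at least one string is
    # found in the hash, and undef when none is found.
    sub combo_row {
        my ( $lookup, $spec ) = @_;
        my @bits  = split /:/, $spec;
        my $name  = pop @bits;                         # last element names the total row
        my @found = grep { exists $lookup->{$_} } @bits;
        return unless @found;                          # no matches: no row
        my $total = 0;
        $total += $lookup->{$_} for @found;            # one match or many
        return "$name,$total";
    }

    for my $spec (
        'string2:string4:totalName',
        'string1:string4:string2:otherName',
        'string7:string9:noName',
        'string2:string9:partName',                    # hypothetical: only one string present
      )
    {
        my $row = combo_row( \%lookup, $spec );
        print defined $row ? "$row\n" : "(no row for $spec)\n";
    }
    ```

    The first two specs give totalName,114 and otherName,148, as in the question, and the third gives no row. Note that a hash lookup matches whole names exactly, whereas the index() test in the original code matched substrings, so the two approaches can differ if one name is contained in another.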

      I agree with your core suggestion: moving the file data into a hash instead of an array lets me perform unordered, non-sequential matches, which was the main problem with my previous approach. Thank you for that!