nkpgmartin has asked for the wisdom of the Perl Monks concerning the following question:

hi, i am having trouble with the second "if" statement in the script below. i only want to process the lines in the "fb.tsv" file where "Scan" appears in column 10. but in some cases the word "Scan" appears in lines i do not want to process (in say, column 20).

obviously the line if (/Scan/) will process every line where "Scan" appears in either the 10th or the 20th column. the commented-out line below returns correct values, but for some reason it appears to go through the file several times, and i only want it to go through the file once. this is probably a simple problem for someone who is more familiar with perl than me. any suggestions? thanks, niki
#!/usr/bin/perl
#
@dbarray1 = ();
@dbarray5 = ();
open(FA, "fa.txt") || die "Cannot find the FA file. \n";
while ($line = <FA>) {
    chomp($line);
    @dbarray = split (/ /,$line);
    push(@dbarray1, $dbarray[1]);
    push(@dbarray5, $dbarray[5]);
}
close(FA);
open(FB, "fb.tsv") || die "Cannot find the FB file. \n";
while ($line = <FB>) {
    chomp($line);
    @array = split (/\s*\|/, $line);
    foreach (@array) {
        ($array[103] =~ m/^DB(\d{1,2})/);
        $db_no = $1;
        for ($i=0; $i<=$#dbarray1; $i++) {
            if (($dbarray1[$i] == $db_no) && (/Pointed/)) {
                $pointed_result = $array[16]*$dbarray5[$i];
            }
            # if ($array[10] =~ m/Scan/)
            if (/Scan/) {
                $scan_result = $array[16]*$dbarray5[$i]+20;
            }
        }
    }
}
close(FB);

Replies are listed 'Best First'.
Re: question about if/pattern matching statement
by kyle (Abbot) on Jul 07, 2008 at 19:59 UTC

    Look at the loops.

    while ($line = <FB>)                          # every line
        foreach (@array)                          # every field in that line
            for ($i=0; $i<=$#dbarray1; $i++)      # items in @dbarray1
                # if ($array[10] =~ m/Scan/)
                if (/Scan/)

    The first condition, ($array[10] =~ m/Scan/), will look for "Scan" in field 10 (the 11th field) n times per line where n is the number of items in @dbarray1 times the number of fields.

    The second condition, (/Scan/), will look for "Scan" in each field n times where n is the number of items in @dbarray1. This is because the regexp match is not bound to a variable, so it's matching against $_, which is being iterated by the "foreach (@array)".
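To make the counting concrete, here is a minimal sketch that mimics the nested loops on a single made-up line. The sub name `count_tests`, the sample line, and the parameter `$n_db` (standing in for the number of items in @dbarray1) are illustrative assumptions, not part of the original script.

```perl
use strict;
use warnings;

# Count how often the unbound test is evaluated for one line.
# $n_db is a hypothetical stand-in for the size of @dbarray1.
sub count_tests {
    my ($line, $n_db) = @_;
    my @array = split /\s*\|/, $line;
    my ($evaluated, $matched) = (0, 0);
    foreach (@array) {                 # $_ is aliased to each field in turn
        for my $i (0 .. $n_db - 1) {
            $evaluated++;
            $matched++ if /Scan/;      # unbound match: tested against $_
        }
    }
    return ($evaluated, $matched);
}

# Four fields, two of which contain "Scan", and three lookup items.
my ($evaluated, $matched) = count_tests("a |Scan |b |Scan", 3);
print "evaluated $evaluated times, matched $matched times\n";
# prints "evaluated 12 times, matched 6 times"
```

With four fields and three lookup items the test runs 12 times for this one line, which may be why the script looks as if it re-reads the file.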

    I suggest you check for "Scan" right after you split into @array:

    while ($line = <FB>) {
        chomp($line);
        @array = split /\s*\|/, $line;
        if ( $array[10] =~ /Scan/ ) {
            # happiness ensues
        }
    }
      you are right - it seems i had an unnecessary loop in there. i searched for 'Scan' and 'Pointed' first, then removed the foreach @array and it works now. thanks a bunch!
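For readers who want to see the shape of the restructured loop, the following is a rough sketch only, not the poster's final code. The sub name `scan_result_for_line` is hypothetical, and it assumes that the DB number in field 103 must match an entry in @dbarray1 for "Scan" lines as well, which the original script does not state.

```perl
use strict;
use warnings;

# Sketch: handle one chomped line of fb.tsv in a single pass.
# $dbarray1 / $dbarray5 are references to the arrays built from fa.txt.
# Returns the scan result, or nothing if the line does not qualify.
sub scan_result_for_line {
    my ($line, $dbarray1, $dbarray5) = @_;
    my @array = split /\s*\|/, $line;

    # Test field 10 once, right after the split; no loop over the fields.
    return unless defined $array[10] && $array[10] =~ /Scan/;

    # Field 103 carries the DB number, as in the original script.
    return unless defined $array[103] && $array[103] =~ /^DB(\d{1,2})/;
    my $db_no = $1;

    for my $i (0 .. $#$dbarray1) {
        # Assumption: only the entry with the matching DB number is used.
        next unless $dbarray1->[$i] == $db_no;
        return $array[16] * $dbarray5->[$i] + 20;
    }
    return;
}
```

It would be called once per line inside the `while ($line = <FB>)` loop, so each line is split and tested exactly once.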
Re: question about if/pattern matching statement
by dragonchild (Archbishop) on Jul 07, 2008 at 19:25 UTC
    How do you know it's going through the file multiple times? Have you created a test file with predictable values?

    Also, I'd seriously look at using Text::xSV instead of your method. For one, it's easier to understand. Two, the bugs are going to be in the logic and not in the parsing.


    My criteria for good software:
    1. Does it work?
    2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
Re: question about if/pattern matching statement
by apl (Monsignor) on Jul 07, 2008 at 19:35 UTC
    Replace
    if (/Scan/)
    with
    if ( index( $_, 'Scan') == 10 )

    Please see the write-up on the index function.
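One point worth checking before relying on this: index returns the character offset at which the substring starts (counting from 0, or -1 if it is absent), not a field number. Inside the `foreach (@array)` loop, `$_` holds a single field, so the comparison with 10 would ask whether "Scan" begins at the 11th character of that field. A small sketch with a made-up line shows the difference; the sample data and variable names are assumptions for illustration.

```perl
use strict;
use warnings;

my $line = "alpha |beta |Scan |gamma";

# index() reports where the substring starts, counted in characters from 0.
my $offset = index($line, 'Scan');

# The field number comes from splitting first and looking at positions.
my @array = split /\s*\|/, $line;
my ($field) = grep { $array[$_] =~ /Scan/ } 0 .. $#array;

print "character offset: $offset, field index: $field\n";
# prints "character offset: 13, field index: 2"
```

To test a specific column, binding the match to the split field, as in `$array[10] =~ /Scan/`, expresses the intent directly.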