gfausel has asked for the wisdom of the Perl Monks concerning the following question:

I am a beginner here, but the program I am working on was written by a "pro". It works fine the way it was set up, but it needs updating, and he is no longer around, so it is up to me. The purpose of the program is to take user input, search an online file, and return matching records. It uses many different fields, and if a user enters more than one word, it treats the words as a single phrase and only returns records where they are found together. I wanted to change this so it would find any combination of the words. My first step was to create an array of all the words. This worked. Then I put a loop around the search function, and it kind of worked. I will post the segment here, with my comments:
if($lT && $Title ne "*") {
    for($index=0;$index<@tit;$index++) {
        $wrd=$tit[$index];
        print "$index,$wrd\n";
COMMENT - to make sure each word was processed individually. Works.
        `grep -i \"$wrd\" $DataIn > $DataTmp`;
COMMENT - I am assuming that the GREP results from DataIn are copied into DataTmp
        $DataIn=$DataTmp;
COMMENT - here the files are swapped I guess.
        if($DataTmp eq $DataTmpa){$DataTmp=$DataTmpb;print "tempb\n";} # if
        else{$DataTmp=$DataTmpa;print "tempa\n";} # else
    } # for
} # if $IT
Logically I would think that this should work. Take the words "WE LOVE": the first pass through should take all the lines that contain the word "WE" and place them into DataTmp, which is then assigned back to DataIn. The second pass should then take all the lines that contain "LOVE" and place those in DataTmp, etc. However, the final output only shows those lines that have "WE LOVE" in them in that order, which is exactly what it did before my changes. Is there something simple I am overlooking? Thank you, Glenn

Re: Online GREP problem
by almut (Canon) on Apr 05, 2010 at 20:42 UTC
    ...so it would find any combination of the two words

    You could also do it without grep and temp files:

    #!/usr/bin/perl

    my @words = qw(we love);

    while (<DATA>) {
        # all search words must appear, but order is irrelevant
        my $found = 0;
        for my $search (@words) {
            $found = /\Q$search/i;
            last unless $found;
        }
        print if $found;
    }

    __DATA__
    foo we love bar
    love we foo bar
    foo love bar we
    foo we bar
    love foo bar

    Output:

    foo we love bar
    love we foo bar
    foo love bar we

    In place of the special DATA handle I'm using here for demo purposes, you'd use the file handle you've opened to your input file, i.e.

    open my $fh, "<", $DataIn or die "Couldn't open '$DataIn': $!";
    while (<$fh>) {
        ...
      Interestingly enough, the file is never opened. There are 4 grep calls before my altered one. Any reason that the grep line is enclosed in ``? Here are the previous calls:
      if($aquiSW =~ /New Stuff/i) {
          `grep -i \"1\$\" $DataIn > $DataTmp`;
          $DataIn=$DataTmp;
          if($DataTmp eq $DataTmpa){$DataTmp=$DataTmpb;} # if
          else{$DataTmp=$DataTmpa;} # else
      } # if aquiSw
      if($lA && $Artist ne ".*") {
          `grep -i \"\^$Artist\" $DataIn > $DataTmp`;
          $DataIn=$DataTmp;
          if($DataTmp eq $DataTmpa){$DataTmp=$DataTmpb;} # if
          else{$DataTmp=$DataTmpa;} # else
      } # if $IA
      if($lL && $Label ne "*") {
          `grep -i \"\|$Label\" $DataIn > $DataTmp`;
          $DataIn=$DataTmp;
          if($DataTmp eq $DataTmpa){$DataTmp=$DataTmpb;} # if
          else{$DataTmp=$DataTmpa;} # else
      } # if $IL
      if($lN && $LabelNo ne "*") {
          `grep -i \"\|$LabelNo\" $DataIn > $DataTmp`;
          $DataIn=$DataTmp;
          if($DataTmp eq $DataTmpa){$DataTmp=$DataTmpb;} # if
          else{$DataTmp=$DataTmpa;} # else
      } # if $IN
        Interestingly enough, the file is never opened.

        The reason is that the external program 'grep' is opening the file itself.

        Any reason that the grep line is enclosed in ``?

        Backticks (``) run an external program, like the grep here.

        My suggestion was to not use backticks at all, as Perl is perfectly capable of doing grep-like jobs all by itself...  and probably a lot faster, too, than when calling the external grep multiple times.  Of course, you'd have to restructure the program accordingly...

        AFAICT, the consecutive runs of grep (in combination with the temp files) serve to extract only those lines where all the tested conditions match.  A restructured solution without grep could look something like this:

        ...
        LINE:
        while (<$fh>) {
            if ($aquiSW =~ /New Stuff/i) {
                next LINE unless /1$/;
            }
            if ($lA && $Artist ne ".*") {
                next LINE unless /^\Q$Artist/i;
            }
            if ($lL && $Label ne "*") {
                next LINE unless /\|\Q$Label/i;
            }
            # ...
            if ($lT && $Title ne "*") {
                # split $Title into @words...
                for my $word (@words) {
                    next LINE unless /\Q$word/i;
                }
            }
            # ...

            # we only get here if all the tested conditions matched
            print;
        }
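
        The "split $Title into @words" step is left open above. A minimal sketch of one way to fill it in, assuming the title words are separated by whitespace; the helper name line_has_all_words is made up for illustration and is not part of the original program:

        ```perl
        use strict;
        use warnings;

        # Hypothetical helper: true if $line contains every word of $title,
        # in any order, ignoring case.
        sub line_has_all_words {
            my ($line, $title) = @_;
            my @words = split ' ', $title;    # split on runs of whitespace
            for my $word (@words) {
                # \Q...\E makes regex metacharacters in the word literal
                return 0 unless $line =~ /\Q$word\E/i;
            }
            return 1;
        }

        print line_has_all_words("foo love bar we", "we love")
            ? "match\n"
            : "no match\n";
        ```

        Note that this is a substring match, so "we" would also match "west"; wrapping the word in \b anchors is one option if whole-word matching is wanted.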
Re: Online GREP problem
by choroba (Cardinal) on Apr 05, 2010 at 20:48 UTC

    I would say the problem lies in the "switching" of files. On the next iteration of the loop, the result of the previous grep is used as input. Therefore, in your example, the word "LOVE" is only being searched for on lines already containing "WE".

    You should always grep the original input file. But it would be faster to search for all the words at the same time, i.e. changing "WE LOVE" to "WE\|LOVE" (by something like $wrd =~ s/ /\\|/g) and grepping just for this.
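
    If the matching is done in Perl rather than with the external grep, the same "all words at once" idea can be sketched as a single alternation regex. This is only an illustration: any_word_regex is a made-up name, and in a Perl regex the alternation is a plain | rather than \|. It matches lines containing ANY of the words, not all of them:

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: build one case-insensitive regex that matches
    # if ANY of the whitespace-separated words in $title occurs.
    sub any_word_regex {
        my ($title) = @_;
        # quotemeta keeps characters such as "." or "*" literal
        my $alternation = join '|', map { quotemeta($_) } split ' ', $title;
        return qr/(?:$alternation)/i;
    }

    my $re = any_word_regex("WE LOVE");
    for my $line ("foo we bar", "love foo bar", "foo bar") {
        print "$line\n" if $line =~ $re;
    }
    ```

    An empty title would produce an empty pattern that matches every line, so a real program would probably want to check for that case first.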

      Thanks for the reply. The problem is that there are 5 different greps, and each does the same thing, switching the files, so each iteration of the file is already "slimmed" down. This is for phonograph records, so there may have been an artist or label already GREPed. This has always worked fine; only my alteration does not. And the reason I use an array is that there could be one word, or 2, 3, 4, etc.
Re: Online GREP problem
by Anonymous Monk on Apr 05, 2010 at 20:29 UTC
    Have you read perlsec and do you know about taint?

    I would switch to ack
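
    To make the perlsec point concrete: in the posted code the user's input is interpolated into a backtick command, so the shell gets to interpret it. A rough sketch of two common precautions, with made-up helper names and an arbitrary whitelist that would need adjusting to the real data. Note that it passes -F to grep, so the term is treated as a fixed string rather than a pattern, which differs from the original calls:

    ```perl
    use strict;
    use warnings;

    # Hypothetical whitelist: accept only word characters, spaces and a few
    # punctuation characters; returns the captured term, or undef if rejected.
    sub clean_search_term {
        my ($input) = @_;
        return undef unless defined $input;
        return $input =~ /\A([\w .'&-]{1,64})\z/ ? $1 : undef;
    }

    # Run grep without a shell: the list form of open hands each argument
    # to grep as-is, so quotes or semicolons in $term are not interpreted.
    sub grep_lines {
        my ($term, $file) = @_;
        open(my $fh, '-|', 'grep', '-i', '-F', '--', $term, $file)
            or die "Couldn't run grep: $!";
        my @lines = <$fh>;
        close $fh;    # grep exits non-zero when nothing matched
        return @lines;
    }

    for my $input ("we love", 'x"; rm -rf /tmp/foo; "') {
        my $term = clean_search_term($input);
        print defined $term ? "ok: $term\n" : "rejected\n";
    }
    ```

    Capturing the term through a regex like this is also how data is untainted when the script runs with -T.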

Re: Online GREP problem
by gfausel (Initiate) on Apr 06, 2010 at 01:44 UTC
    Ignore this post. I found that my code did work, and a line about 50 lines away was causing the problem. Thanks for all the help.