Online GREP problem

gfausel has asked for the wisdom of the Perl Monks concerning the following question:

I am a beginner here, but the program I am working on was done by a "pro". It works fine the way it was set up, but it needs updating, and he is no longer around, so it is up to me. The purpose of the program is to take user input, search the online file, and return matching records. It searches on several different fields, and if a user enters more than one word, it treats the words as a single phrase and only returns records where they are found together. I wanted to change this so it would find the words in any combination. My first step was to create an array of all the words. This worked. Then I wrapped the search in a loop, and it kinda worked. I will post the segment here, with my comments:

if($lT && $Title ne "*") {

for($index=0;$index<@tit;$index++) {
$wrd=$tit[$index];
print "$index,$wrd\n"; # COMMENT - to make sure each word was processed individually. Works.
`grep -i \"$wrd\" $DataIn > $DataTmp`; # COMMENT - I am assuming that the GREP results from DataIn are copied into DataTmp
  $DataIn=$DataTmp; # COMMENT - here the files are swapped I guess.

 if($DataTmp eq $DataTmpa){$DataTmp=$DataTmpb;print "tempb\n";} # if
 else{$DataTmp=$DataTmpa;print "tempa\n";} # else
} # for
} # if $IT

Logically I would think that this should work. Take the words "WE LOVE": the first pass through should take all the lines that contain the word "WE" and place them into DataTmp, which is then assigned back to DataIn. The second pass should then take all the lines that contain "LOVE" and place them in DataTmp, etc. However, the final output only shows those lines that have "WE LOVE" in them in that order, which is exactly what it did before my changes. Is there something simple I am overlooking, or?? Thank you, Glenn
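The pass-by-pass narrowing described above can be sketched in memory, without grep or temp files. This is only an illustration: the helper name `filter_all_words` and the sample records are made up, and it assumes plain substring matching (so "we" would also match inside "sweet"), which is what `grep -i` does with a plain word.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original program): keep only the lines
# that contain every word, narrowing the candidate list one word at a time.
sub filter_all_words {
    my ($lines, @words) = @_;
    my @kept = @$lines;
    for my $word (@words) {
        # \Q...\E treats the word literally; /i ignores case like grep -i
        @kept = grep { /\Q$word\E/i } @kept;
    }
    return @kept;
}

# Made-up sample records standing in for the real data file
my @records = (
    "WE LOVE MUSIC",
    "LOVE IS ALL WE NEED",
    "WE ARE FAMILY",
    "LOVE ME DO",
);

# split ' ' breaks the user's input on runs of whitespace
my @words = split ' ', "WE LOVE";

my @matches = filter_all_words(\@records, @words);
print "$_\n" for @matches;
```

With these sample records the sketch prints "WE LOVE MUSIC" and "LOVE IS ALL WE NEED": each pass only removes lines, so the survivors contain every word, in whatever order.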


Re: Online GREP problem by almut (Canon) on Apr 05, 2010 at 20:42 UTC
"...so it would find any combination of the two words"

You could also do it without grep and temp files:

```perl
#!/usr/bin/perl

my @words = qw(we love);

while (<DATA>) {
    # all search words must appear, but order is irrelevant
    my $found = 0;
    for my $search (@words) {
        $found = /\Q$search/i;
        last unless $found;
    }
    print if $found;
}

__DATA__
foo we love bar
love we foo bar
foo love bar we
foo we bar
love foo bar
```

Output:

```
foo we love bar
love we foo bar
foo love bar we
```

In place of the special DATA handle I'm using here for demo purposes, you'd use the file handle you've opened to your input file, i.e.

```perl
open my $fh, "<", $DataIn or die "Couldn't open '$DataIn': $!";
while (<$fh>) {
    ...
```
Re^2: Online GREP problem by gfausel (Initiate) on Apr 05, 2010 at 21:13 UTC
Interestingly enough, the file is never opened. There are 4 grep calls before my altered one. Any reason that the grep line is enclosed in ``? Here are the previous calls.

if($aquiSW =~ /New Stuff/i) {`grep -i \"1\$\" $DataIn > $DataTmp`;
 $DataIn=$DataTmp;
 if($DataTmp eq $DataTmpa){$DataTmp=$DataTmpb;} # if
 else{$DataTmp=$DataTmpa;} # else
} # if aquiSw

if($lA && $Artist ne ".") {`grep -i \"\^$Artist\" $DataIn > $DataTmp`;
 $DataIn=$DataTmp;
 if($DataTmp eq $DataTmpa){$DataTmp=$DataTmpb;} # if
 else{$DataTmp=$DataTmpa;} # else
} # if $IA

if($lL && $Label ne ""){`grep -i \"\\|$Label\" $DataIn > $DataTmp`;
 $DataIn=$DataTmp;
 if($DataTmp eq $DataTmpa){$DataTmp=$DataTmpb;} # if
 else{$DataTmp=$DataTmpa;} # else
} # if $IL

if($lN && $LabelNo ne "*"){`grep -i \"\\|$LabelNo\" $DataIn > $DataTmp`;
 $DataIn=$DataTmp;
 if($DataTmp eq $DataTmpa){$DataTmp=$DataTmpb;} # if
 else{$DataTmp=$DataTmpa;} # else
} # if $IN
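Each of the blocks above repeats the same step: filter the current input into a temp file, make that temp file the new input, and flip to the other temp file for the next write. A rough sketch of that step as one reusable helper follows. The names `filter_file` and `narrow` are hypothetical, the filtering is done in Perl rather than by the external grep so the sketch runs on its own, and the demo records are made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: copy the lines of $in that match $regex into $out,
# which is the job each `grep ... $DataIn > $DataTmp` call does.
sub filter_file {
    my ($regex, $in, $out) = @_;
    open my $src, '<', $in  or die "Couldn't open '$in': $!";
    open my $dst, '>', $out or die "Couldn't open '$out': $!";
    while (my $line = <$src>) {
        print {$dst} $line if $line =~ $regex;
    }
    close $dst or die "Couldn't close '$out': $!";
    close $src;
    return;
}

# Hypothetical helper: one narrowing step. Returns the new input file and
# the temp file to write next, alternating between the two temp files so
# that a step never reads and writes the same file.
sub narrow {
    my ($regex, $in, $tmp, $tmp_a, $tmp_b) = @_;
    filter_file($regex, $in, $tmp);
    my $next_tmp = $tmp eq $tmp_a ? $tmp_b : $tmp_a;
    return ($tmp, $next_tmp);
}

# Made-up demo data standing in for the real record file
my ($data_fh, $data_in) = tempfile(UNLINK => 1);
print {$data_fh} "WE LOVE MUSIC\n", "LOVE IS ALL WE NEED\n", "WE ARE FAMILY\n";
close $data_fh or die "Couldn't close '$data_in': $!";

my (undef, $tmp_a) = tempfile(UNLINK => 1);
my (undef, $tmp_b) = tempfile(UNLINK => 1);

my ($in, $tmp) = ($data_in, $tmp_a);
for my $word (qw(we love)) {
    ($in, $tmp) = narrow(qr/\Q$word\E/i, $in, $tmp, $tmp_a, $tmp_b);
}

# $in now names the file holding the final, narrowed result
open my $result, '<', $in or die "Couldn't open '$in': $!";
print while <$result>;
close $result;
```

Keeping the swap in one place means a new search field only needs one more call to `narrow`, instead of another copy of the if/else pair.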
Re^3: Online GREP problem by almut (Canon) on Apr 05, 2010 at 21:26 UTC
"Interestingly enough, the file is never opened."

The reason is that the external program `grep` is opening the file itself.

"Any reason that the grep line is enclosed in ``?"

Backticks (``) run an external program, like the `grep` here. My suggestion was to not use backticks at all, as Perl is perfectly capable of doing grep-like jobs all by itself... and probably a lot faster, too, than when calling the external `grep` multiple times. Of course, you'd have to restructure the program accordingly...

AFAICT, the consecutive runs of `grep` (in combination with the use of the temp files) have the effect of extracting the lines where all the tested conditions match. A restructured solution without using `grep` could look something like this:

```perl
...

LINE:
while (<$fh>) {

    if ($aquiSW =~ /New Stuff/i) {
        next LINE unless /1$/;
    }
    if ($lA && $Artist ne ".") {
        next LINE unless /^\Q$Artist/i;
    }
    if ($lL && $Label ne "") {
        next LINE unless /\\|\Q$Label/i;
    }
    # ...

    if ($lT && $Title ne "*") {
        # split $Title into @words...
        for my $word (@words) {
            next LINE unless /\Q$word/i;
        }
    }
    # ...

    # we only get here if all the tested conditions matched
    print;
}
```
Re: Online GREP problem by choroba (Cardinal) on Apr 05, 2010 at 20:48 UTC
I would say the problem lies in the "switching" of files. On the next iteration of the loop, the result of the previous grep is used as input. Therefore, in your example, the word "LOVE" is only searched for on lines that already contain "WE". You should always grep the original input file. But it would be faster to search for all the words at the same time, i.e. changing "WE LOVE" to "WE\\|LOVE" (by something like `$wrd =~ s/ /\\\|/g`) and grepping just for this.
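For comparison, here is a small pure-Perl sketch of the single-pattern idea next to the "every word" check. One thing to keep in mind is that an alternation matches a line containing any one of the words, which is a different result from requiring all of them. The helper name `has_all_words` and the sample lines are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @words = qw(we love);

# One pattern that matches if ANY of the words appears (alternation).
# quotemeta escapes regex metacharacters in the user's words.
my $any    = join '|', map { quotemeta($_) } @words;
my $any_re = qr/$any/i;

# Hypothetical helper: true only if EVERY word appears, in any order.
sub has_all_words {
    my ($line, @wanted) = @_;
    for my $word (@wanted) {
        return 0 unless $line =~ /\Q$word\E/i;
    }
    return 1;
}

# Made-up sample lines
for my $line ("WE LOVE MUSIC", "WE ARE FAMILY", "ROCK AND ROLL") {
    my $either = $line =~ $any_re              ? "yes" : "no";
    my $both   = has_all_words($line, @words)  ? "yes" : "no";
    print "$line: any=$either all=$both\n";
}
```

"WE ARE FAMILY" is the interesting case: it matches the alternation because it contains "WE", but fails the all-words check because "LOVE" is missing.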
Re^2: Online GREP problem by gfausel (Initiate) on Apr 05, 2010 at 21:07 UTC
Thanks for the reply. Problem is that there are 5 different greps, and each does the same thing, switching the files, so each iteration of the file is already "slimmed" down. This is for phonograph records, so there may have been an artist or label already GREPed. This has always worked fine, only my alteration does not. And the reason I use an array is that there could be one word, or 2, 3, 4, etc.
Re: Online GREP problem by Anonymous Monk on Apr 05, 2010 at 20:29 UTC
Have you read perlsec and do you know about taint? I would switch to ack
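The concern behind that question is that the user's search word is interpolated into a command line that the shell then parses, so quotes or semicolons in the input can change the command. Under taint mode, perlsec describes capturing the acceptable part of the input with a regex as the way to untaint it. A minimal sketch of such a check follows. The helper name `clean_search_word` and the allowed character set (word characters, spaces, apostrophes, hyphens, at most 64 characters) are assumptions for illustration, not rules from the original program.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: accept a search word only if it consists of word
# characters, spaces, apostrophes and hyphens. Returns the captured value
# (which is what untaints it under taint mode), or undef if rejected.
sub clean_search_word {
    my ($input) = @_;
    return undef unless defined $input;
    if ($input =~ /\A([\w' -]{1,64})\z/) {
        return $1;
    }
    return undef;
}

# Made-up inputs: two ordinary ones and one trying to break out of the quotes
for my $input ("LOVE", "O'Brien", 'x"; rm -rf /tmp/junk; echo "') {
    my $word = clean_search_word($input);
    print defined $word ? "accepted: $word\n" : "rejected\n";
}
```

Doing the matching inside Perl, as suggested elsewhere in this thread, avoids the shell altogether, which removes this class of problem rather than filtering for it.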
Re: Online GREP problem by gfausel (Initiate) on Apr 06, 2010 at 01:44 UTC
Ignore this post. I found that my code did work, and a line about 50 lines away was causing the problem. Thanks for all the help.