the_slycer has asked for the wisdom of the Perl Monks concerning the following question:

Greetings

I have written a script that searches a list of text files, matching either on the filename or on the files' contents (depending on which button you push; this is Tk'd). The filename search is obviously very fast, but the "grep" of the files is extremely slow, and I was hoping to find a way to speed it up. There are about 250 files, amounting (in total) to no more than 350k. I have implemented a poor man's "cache" to try to speed up a second search for the same value. Here is the code snippet I'm using for the search:
foreach $filename (@file_array) {
    chomp ($filename);
    open (FH, "$filename") or warn "Could not open $filename $!";
    while ($line = <FH>) {
        if ($line =~ /.*$search_value*/i) {
            ++$matched;
            $file_listbox->insert('0', "$filename--> $line");
            open (RECENT, ">>$installpath/recent");
            print RECENT "$search_value= $filename--> $line";
            $numhash{"$search_value"} = "true";
            close (RECENT);
        }
    }
    close (FH);
}
The search tool shows (as is obvious from above) the whole line that the search value was found on.

Any assistance would be deeply appreciated.

Replies are listed 'Best First'.
Re: Faster way?
by Fastolfe (Vicar) on Oct 06, 2000 at 22:08 UTC
    Firstly, your regexp /.*$search_value*/ will not do quite what you think. If $search_value is 'abc', the string "xyzab" will match, since the trailing * is applied to the 'c' in your resulting pattern. The regular expression will also cause your code to die if $search_value has characters that goof up the regexp compilation.
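    To see the pitfall concretely, here is a small demonstration (the values are made up): with $search_value set to 'abc', the pattern /.*$search_value*/ compiles to /.*abc*/, and the trailing * makes the final 'c' optional, so strings that do not contain 'abc' still match.

```perl
# Hypothetical values, showing why /.*$search_value*/ misbehaves.
my $search_value = 'abc';
my $string = 'xyzab';                      # does NOT contain 'abc'

# The trailing * binds to the 'c', making it optional, so this matches:
my $broken = ($string =~ /.*$search_value*/i) ? 'match' : 'no match';
# A plain, unanchored pattern correctly fails to match:
my $plain  = ($string =~ /$search_value/i)    ? 'match' : 'no match';

print "broken pattern: $broken\n";   # broken pattern: match
print "plain pattern:  $plain\n";    # plain pattern:  no match
```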
    chomp(@file_array);
    open(RECENT, ">>$installpath/recent");   # or just >
    foreach $filename (@file_array) {
        open(FH, "< $filename") or warn "Could not open $filename: $!";
        while (<FH>) {
            if (/$search_value/o) {
                print RECENT "$search_value=$filename--> $_";
                $numhash{$search_value}++;
                $matched++;
            }
        }
        close(FH);
    }
    close(RECENT);
    We move the RECENT file stuff outside of your loop, since it makes little sense to keep re-opening the file for every line we want to write. I imagine that's a major source of your speed problems. Since $search_value doesn't change, we optimize the regular expression with the /o switch. If you wanted to forget about subsequent matches in a given file, you could add a last statement inside your loop, which would skip to the next file instead of reading the rest of the current file, but it seems like you're interested in each line that matches.

    If your 'recent' file is a temporary/transient thing, used only for processing later in your script, you might also want to consider just storing your matches in an internal data structure, and use them later instead of reading from your file:

    push(@{$search_results{$search_value}->{$filename}}, $line);
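    As a minimal sketch of that idea (the filenames, lines, and reporting loop here are assumptions, not from the original script), you would push each matching line into the structure as you read, then walk it afterwards instead of re-reading a 'recent' file:

```perl
my %search_results;
my $search_value = 'foo';    # hypothetical search term

# Inside the per-file read loop, each matching line would be stored like so:
push @{ $search_results{$search_value}{'a.txt'} }, "foo bar\n";
push @{ $search_results{$search_value}{'b.txt'} }, "more foo here\n";

# Later, report every hit without touching the filesystem again:
for my $filename (sort keys %{ $search_results{$search_value} }) {
    print "$filename--> $_" for @{ $search_results{$search_value}{$filename} };
}
```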
    Of course, don't underestimate the simplicity of doing this without Perl, if that's all you have to do. The 'grep' command can perform this task natively, unless you need to do some additional processing on the data, and aren't just building a 'recent' file with text matches.

      Unless the people using your script can be trusted with learning regexp syntax, you may wish to write that as: if (/\Q$search_value\E/o) { with the \Q doing a quotemeta on the string to make sure there aren't "confusing" things in there. Sooner or later some bright-boy will try and search for ".*" for some reason and be lucky enough to have your script return all 350k of data to him...

      If you are cleaning up and fixing the search pattern elsewhere, ignore this.
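      A quick illustration of what \Q buys you (the values are made up): an unquoted ".*" matches every line, while the quoted version only matches a literal ".*".

```perl
my $search_value = '.*';             # a user typing regexp metacharacters
my $line = 'nothing special here';

# Unquoted, '.*' is a wildcard and matches any line at all:
print "unquoted: match\n" if $line =~ /$search_value/;
# With \Q...\E (quotemeta), it only matches a literal '.*' in the text:
print "quoted: match\n"   if $line =~ /\Q$search_value\E/;
```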

      --
      $you = new YOU;
      honk() if $you->love(perl)

      Actually, I want it to match the search value regardless of where it comes up in the line :-).
      There is a lot more that the prog does, and it's not just for my use: this is for a bunch of people who have never used grep. Plus, we are using Win2k here, and this app is faster than the "find" command. It's one of those cases where people have used an app for a while and don't want to see it go. We used to use scripts on DEC/VMS to do this :-)
        /$search_pattern/ is unanchored:
        "abcdefg" =~ /cd/ # true "abcdefg" =~ /^cd/ # anchored at start, false "abcdefg" =~ /fg$/ # anchored at end, true
        The regular expression I provided will match anywhere in the line, unless you've modified that behavior by inserting a ^ or $ at the beginning of the search string, or the end, respectively.
Re: Faster way?
by jreades (Friar) on Oct 07, 2000 at 00:36 UTC

    With what sort of frequency are the files updated -- is it worth stuffing matches into a database and expiring them every couple of days?

    Are matches unique, or multiple? (I think the latter, but this isn't spelled out.)

    Depending on the frequency of modification, you could use a table setup something like the following:

    files:
        file_id           UNIQUE INT
        path              UNIQUE VARCHAR/TEXT
        last_access_stat  TIME

    matches:
        word      VARCHAR
        file_id   INT
        position  INT

    This is just an off-the-cuff format, nothing to run a db by...
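    A hedged sketch of creating those two tables from Perl with DBI (the DBD::SQLite driver, database name, and exact column types are assumptions for illustration; the thread does not specify a database):

```perl
use DBI;

# In-memory SQLite database purely for demonstration; a real index
# would use a file-backed database instead.
my $dbh = DBI->connect("dbi:SQLite:dbname=:memory:", "", "",
                       { RaiseError => 1 });

# One row per indexed file.
$dbh->do(q{
    CREATE TABLE files (
        file_id          INTEGER PRIMARY KEY,
        path             TEXT UNIQUE,
        last_access_stat INTEGER
    )
});

# One row per word occurrence, pointing back at its file.
$dbh->do(q{
    CREATE TABLE matches (
        word     TEXT,
        file_id  INTEGER,
        position INTEGER
    )
});
```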
