in reply to Re^2: How to efficiently search for list of strings in multiple files?
in thread How to efficiently search for list of strings in multiple files?
Hello again stray_tachyon,
As I said, use File::Find::Rule and adapt it to fit your needs. I also like the module IO::All, which loads a file into an array so you can grep for all the words in one go. Why do I think this approach is better? You check only specific directories for specific files, and you do not need to know the names of the files, just their extensions. So, in conclusion, for me the efficiency comes from filtering down to the files you actually need to open.
From my point of view this is more efficient, but maybe another Monk has a better idea. In the end, why rely on people's opinions when you can simply measure it yourself? Use Benchmark::Forking if you are on Linux, or the plain Benchmark module if you are on Windows.
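To make the benchmarking suggestion concrete, here is a minimal sketch using the core Benchmark module. As far as I know, Benchmark::Forking exports the same functions and runs each timed piece of code in a forked process, so swapping the `use` line should be enough on Linux. The sub names `count_separate` and `count_combined` and the generated test lines are my own assumptions for illustration; feed it lines from your real logs to get numbers that mean something.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data: 3000 lines, two of every three contain a keyword.
my @lines = map {
    ( "Start line key line.", "Ignore this line.", "Another line key end." )
} 1 .. 1000;

# Variant A: three separate matches per line.
sub count_separate {
    my ($lines) = @_;
    my $count = grep { /Start/ or /middle/ or /end/ } @$lines;
    return $count;
}

# Variant B: a single precompiled alternation.
my $re = qr/Start|middle|end/;

sub count_combined {
    my ($lines) = @_;
    my $count = grep { /$re/ } @$lines;
    return $count;
}

# Run each variant for at least 1 CPU second and print a comparison table.
cmpthese( -1, {
    separate => sub { count_separate( \@lines ) },
    combined => sub { count_combined( \@lines ) },
} );
```

Which variant wins depends on your Perl version and your data, so treat the table it prints as the answer for your machine only.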
Sample of code:
    #!/usr/bin/perl
    use strict;
    use IO::All;
    use warnings;
    use Data::Dumper;
    use File::Find::Rule;

    my @LogDirs = ('/home/user/PerlMonks/', '/tmp/Monks'); # add more here
    my $level = shift // 3; # level to dig into

    my @files = File::Find::Rule->file()
                                ->name( '*.txt', '*.log' ) #can insert regex too
                                ->maxdepth($level)
                                ->in(@LogDirs);

    my %hash;
    foreach my $file (@files) {
        my @lines = io($file)->chomp->slurp;
        my $matches = grep {/Start/ or /middle/ or /end/} @lines;
        $hash{$file} = $matches;
    }
    print Dumper \%hash;

    __END__
    $ perl test.pl
    $VAR1 = {
              '/home/user/PerlMonks/Foo/Bar/monks.log' => 1,
              '/home/user/PerlMonks/sample.txt' => 1,
              '/tmp/Monks/SampleDir/monks.log' => 2
            };
Sample of data in files:
    $ cat /tmp/Monks/SampleDir/monks.log
    Start line key line.
    Ignore this line.
    Another line key end.

    $ cat /home/user/PerlMonks/Foo/Bar/monks.log
    This is line 1.
    This key line end.

    $ cat /home/user/PerlMonks/sample.txt
    This is another key line middle point.
    I do not care about this.
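If the files are large, or you cannot install CPAN modules, a variant worth trying is reading line by line with core modules only. This is just a rough sketch and not the approach from my sample above: the helpers `count_matches` and `find_logs` are names I made up, the directories come from the command line, and there is no depth limit like `maxdepth`.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: count the lines in $file that match $re,
# reading one line at a time instead of slurping the whole file.
sub count_matches {
    my ( $file, $re ) = @_;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    my $count = 0;
    while ( my $line = <$fh> ) {
        $count++ if $line =~ $re;
    }
    close $fh;
    return $count;
}

# Hypothetical helper: collect plain files ending in .txt or .log under @dirs.
sub find_logs {
    my (@dirs) = @_;
    return () unless @dirs;
    my @files;
    find(
        sub {
            push @files, $File::Find::name
                if -f $_ && /\.(?:txt|log)\z/;
        },
        @dirs
    );
    return @files;
}

my $re   = qr/Start|middle|end/;
my @dirs = grep { -d $_ } @ARGV;    # directories given on the command line
my %hash = map { $_ => count_matches( $_, $re ) } find_logs(@dirs);
printf "%s => %d\n", $_, $hash{$_} for sort keys %hash;
```

Run it as `perl test.pl /home/user/PerlMonks /tmp/Monks` to get one line per file with its count of matching lines. Whether it beats the slurp version is exactly the kind of thing to benchmark rather than guess.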
Hope this helps, BR.