Re: grepping a large file and stopping on first match to a list

"Does this stop searching on the first match?"

Well, the obvious answer is to run it and find out for yourself.

You haven't told us anything about 'huge_file' except how many lines it has. What's the file size? What's the average (or typical) record length? Is there anything special about the file or is it just plain text?

You haven't told us anything about '@strings' except that it's "a small set". What's the average (or typical) string length? Is there anything special about the strings?

List::Util::first() stops searching on the first match. Why do you think this may not be the case with your code?
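
If you want to see the short-circuiting for yourself, here's a minimal sketch that counts how often the block runs. The list and the '$calls' counter are made up for illustration; they're not from your code.

```perl
use strict;
use warnings;
use List::Util qw{first};

# Count how many times the block runs before first() returns.
my $calls = 0;
my $found = first { $calls++; $_ eq 'c' } qw{a b c d e};

# 'c' is the third element, so the block runs three times, not five.
print "Found '$found' after $calls calls\n";
```

The elements after the match ('d' and 'e') are never passed to the block.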

If you're concerned about memory usage, you might want to consider the core module, Tie::File. Its documentation says:

"The file is not loaded into memory, so this will work even for gigantic files."

If you're concerned about speed, try several methods and Benchmark them.
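
As a rough sketch of what I mean by benchmarking, something like the following compares three ways of finding the first match. Everything here is an assumption on my part: the in-memory '@lines' data is invented, and the three subroutine names are just labels I picked. Substitute a realistic sample of your own data before drawing any conclusions.

```perl
use strict;
use warnings;
use Benchmark qw{cmpthese};
use List::Util qw{first};

# Hypothetical test data: 1000 non-matching lines, one match, 1000 more.
my @lines = (('x' x 40) x 1000, 'abcxyzdef', ('y' x 40) x 1000);

my @strings = qw{rst uvw xyz};
my $alternation = join '|', map { quotemeta $_ } @strings;
my $re = qr{$alternation};

# Stops at the first match.
sub with_first { return first { /$re/ } @lines }

# Also stops at the first match, using an explicit loop.
sub with_loop {
    for my $line (@lines) {
        return $line if $line =~ $re;
    }
    return;
}

# Tests every line, then keeps only the first result.
sub with_grep { return (grep { /$re/ } @lines)[0] }

# A negative count means "run each candidate for at least that many CPU seconds".
cmpthese(-1, {
    first => \&with_first,
    loop  => \&with_loop,
    grep  => \&with_grep,
});
```

The numbers will depend heavily on where the first match falls in the data, which is one more reason to test with input that resembles your real file.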

I'll also point out that the Smartmatch Operator (~~) is experimental, subject to change, and not suitable for production code.

Bearing in mind all the things I don't know about your situation, here's how I might have tackled this task:

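#!/usr/bin/env perl

use strict;
use warnings;
use autodie;

use List::Util qw{first};
use Tie::File;

my @strings = qw{rst uvw xyz};
# quotemeta() guards against regex metacharacters in the strings.
my $re = '(?:' . join('|', map { quotemeta $_ } @strings) . ')';

# autodie doesn't cover tie(), so check for failure explicitly.
tie my @lines, 'Tie::File', 'pm_1155868_input.txt'
    or die "Can't tie 'pm_1155868_input.txt': $!";
my $match = first { /$re/ } @lines;
untie @lines;

print "Match: '$match'\n";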

With this data:

$ cat pm_1155868_input.txt
aaaaaaaaaaaaaa
bbbbbbbbbbbbbb
cccccccccccccc
dddddddxyzdddd
eeeeeeeuvweeee
fffffffrstffff
gggggggggggggg
hhhhhhhhhhhhhh
iiiiiiiiiiiiii

I get this result:

Match: 'dddddddxyzdddd'

That may be suitable for your needs. If so, great! If not, you'll need to provide more information: answering the questions I've already posed would be a good start; see also "How do I post a question effectively?".
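
For comparison, here's a sketch of a different technique: reading the file line by line with a plain while loop and returning as soon as a line matches. This is not the Tie::File approach above. It only ever holds one line in memory and reads no further than the first match. The subroutine name 'first_matching_line' is my own invention, and the demo writes a few of the sample lines to a temporary file so that it runs on its own.

```perl
use strict;
use warnings;
use autodie;
use File::Temp qw{tempfile};

# Return the first line of $file containing any of @strings, or undef if none.
sub first_matching_line {
    my ($file, @strings) = @_;
    my $alternation = join '|', map { quotemeta $_ } @strings;
    my $re = qr{$alternation};

    open my $fh, '<', $file;
    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ $re) {
            close $fh;
            return $line;    # stop reading at the first match
        }
    }
    close $fh;
    return;
}

# Demo data: a temporary file holding a few of the sample lines.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} "$_\n" for qw{aaaaaaaaaaaaaa dddddddxyzdddd eeeeeeeuvweeee};
close $tmp_fh;

my $match = first_matching_line($tmp_name, qw{rst uvw xyz});
print defined $match ? "Match: '$match'\n" : "No match\n";
```

With that demo data it prints "Match: 'dddddddxyzdddd'". Whether it's faster or slower than the alternatives for your file is something to measure rather than guess.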

— Ken
