G'day msh210,

"Does this stop searching on the first match?"

Well, the obvious answer is to run it and find out for yourself.

You haven't told us anything about 'huge_file' except how many lines it has. What's the file size? What's the average (or typical) record length? Is there anything special about the file or is it just plain text?

You haven't told us anything about '@strings' except that it's "a small set". What's the average (or typical) string length? Is there anything special about the strings?

List::Util::first() stops searching on the first match. Why do you think this may not be the case with your code?
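One way to check this for yourself is to count how many elements the block actually gets run against. This is only a minimal sketch: the sample data and the $tested counter are made up for illustration and are not taken from your code.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use List::Util qw{first};

# Hypothetical sample data; the third element is the first one to match.
my @lines = qw{aaa bbb xyz ccc uvw};

my $tested = 0;    # counts how many elements the block was run against
my $match  = first { ++$tested; /xyz|uvw/ } @lines;

print "Match: '$match' after testing $tested of ", scalar(@lines), " elements\n";
# prints: Match: 'xyz' after testing 3 of 5 elements
```

The count stops at 3 even though 'uvw' would also have matched further along, which is what you'd expect if first() returns as soon as the block is true.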

If you're concerned about memory usage, you might want to consider the core module, Tie::File. Its documentation says:

"The file is not loaded into memory, so this will work even for gigantic files."
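For comparison, a plain while loop over a filehandle also reads one line at a time, and "last" stops it on the first match. The following is a rough sketch of that alternative, not a recommendation over Tie::File; to keep it self-contained, an in-memory filehandle with made-up data stands in for your huge file, and the search strings are hypothetical.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use autodie;

# Hypothetical search strings; quotemeta guards against regex metacharacters.
my @strings = qw{rst uvw xyz};
my $re      = join '|', map { quotemeta($_) } @strings;

# Stand-in for the real file: an in-memory filehandle over sample data.
my $data = <<'EOT';
aaaaaaaaaaaaaa
dddddddxyzdddd
eeeeeeeuvweeee
EOT

my $match;
open my $fh, '<', \$data;
while (my $line = <$fh>) {
    chomp $line;
    if ($line =~ /$re/) {
        $match = $line;
        last;    # stop reading as soon as one line matches
    }
}
close $fh;

print defined($match) ? "Match: '$match'\n" : "No match\n";
# prints: Match: 'dddddddxyzdddd'
```

To run it against a real file, replace \$data in the open call with the file's path.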

If you're concerned about speed, try several methods and Benchmark them.
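A comparison with the core Benchmark module might look something like the sketch below. The data, the sub names (with_first, with_loop) and the choice of candidates are all assumptions for illustration; you'd want to substitute your real data and the methods you're actually considering. The timings will differ from machine to machine, so I haven't shown any output.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Benchmark qw{cmpthese};
use List::Util qw{first};

# Hypothetical data: 100_000 non-matching lines, then one that matches.
my @lines   = (('a' x 14) x 100_000, 'dddddddxyzdddd');
my @strings = qw{rst uvw xyz};
my $alt     = join '|', map { quotemeta($_) } @strings;
my $re      = qr{$alt};

# Candidate 1: List::Util::first with a regex.
sub with_first { return first { /$re/ } @lines }

# Candidate 2: an explicit loop that returns on the first match.
sub with_loop {
    for my $line (@lines) {
        return $line if $line =~ $re;
    }
    return;
}

# Run each candidate for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    first => \&with_first,
    loop  => \&with_loop,
});
```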

I'll also point out that the Smartmatch Operator (~~) is experimental, subject to change, and not suitable for production code.

Bearing in mind all the things I don't know about your situation, here's how I might have tackled this task:

    #!/usr/bin/env perl -l

    use strict;
    use warnings;
    use autodie;

    use List::Util qw{first};
    use Tie::File;

    my @strings = qw{rst uvw xyz};
    my $re = '(?:' . join('|', @strings) . ')';

    tie my @lines, 'Tie::File', 'pm_1155868_input.txt';

    my $match = first { /$re/ } @lines;

    untie @lines;

    print "Match: '$match'";

With this data:

    $ cat pm_1155868_input.txt
    aaaaaaaaaaaaaa
    bbbbbbbbbbbbbb
    cccccccccccccc
    dddddddxyzdddd
    eeeeeeeuvweeee
    fffffffrstffff
    gggggggggggggg
    hhhhhhhhhhhhhh
    iiiiiiiiiiiiii

I get this result:

Match: 'dddddddxyzdddd'

That may be suitable for your needs. If so, great! If not, you'll need to provide more information: answering the questions I've already posed would be a good start; see also "How do I post a question effectively?".

— Ken


In reply to Re: grepping a large file and stopping on first match to a list by kcott
in thread grepping a large file and stopping on first match to a list by msh210
