The slurp technique shown is fine if you have small files. I know I have servers where slurping the biggest file on disk would make a serious dent in the RAM, enough to send the machine swapping into oblivion.

The other thing I'm not sure of is whether these patterns occur on the same line or on different lines. The former case is easy to solve; the latter means checking, each time one pattern matches, whether you have already seen the other:

Assuming you have a variable named $lacking_both that keeps count, your file read loop would look like:

    my $saw_alpha   = 0;
    my $saw_charlie = 0;
    while( <DAT> ) {
        if( /alpha/ ) {
            ++$saw_alpha;
            last if $saw_charlie;
        }
        if( /charlie/ ) {
            ++$saw_charlie;
            last if $saw_alpha;
        }
    }
    ++$lacking_both unless $saw_alpha and $saw_charlie;

For a one-off this will do, but if you need to look for an arbitrary number of patterns it might be worth generalising this to use a hash instead of discrete scalars.
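As a rough sketch of that generalisation (the sub name saw_all and its calling convention are my own invention, not anything from the original thread), you could keep the patterns not yet seen in a hash and delete each one as it matches:

```perl
use strict;
use warnings;

# Hypothetical helper: returns true if every pattern in @$patterns
# matches somewhere in the data read from the filehandle $fh.
sub saw_all {
    my ($fh, $patterns) = @_;

    # Patterns we have not seen yet, keyed by their source text.
    my %pending = map { $_ => qr/$_/ } @$patterns;

    while ( my $line = <$fh> ) {
        # keys() returns a copy of the key list, so deleting inside the loop is safe.
        for my $name ( keys %pending ) {
            delete $pending{$name} if $line =~ $pending{$name};
        }
        last unless %pending;    # short-circuit once everything has matched
    }
    return !%pending;
}
```

With the loop above that would be called as something like ++$lacking_both unless saw_all(\*DAT, [qw(alpha charlie)]); and adding a third pattern is just another list element.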

I'd probably also be tempted to

push @lacking_both, $name

so that @lacking_both in scalar context still gives you the number of files lacking both patterns, and you also have the names of the files themselves, which you are probably going to need somewhere down the track anyway.
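Putting the pieces together, here is a minimal sketch of the whole thing. The sub name files_lacking_both and the use of @ARGV for the file list are assumptions on my part, since the original question doesn't say where the file names come from:

```perl
use strict;
use warnings;

# Hypothetical wrapper: returns the names of the files that do not
# contain both /alpha/ and /charlie/ somewhere in their contents.
sub files_lacking_both {
    my @names = @_;
    my @lacking_both;

    for my $name (@names) {
        open my $fh, '<', $name
            or do { warn "Cannot open $name: $!\n"; next };

        my ($saw_alpha, $saw_charlie) = (0, 0);
        while ( <$fh> ) {
            $saw_alpha   ||= /alpha/;
            $saw_charlie ||= /charlie/;
            last if $saw_alpha and $saw_charlie;   # no need to read any further
        }
        close $fh;

        push @lacking_both, $name unless $saw_alpha and $saw_charlie;
    }
    return @lacking_both;
}

my @lacking_both = files_lacking_both(@ARGV);
printf "%d file(s) lacking both patterns\n", scalar @lacking_both;
print "$_\n" for @lacking_both;
```

The array does double duty: scalar @lacking_both is the count, and the elements are the names to report or process later.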

update: another thing I just thought of: a big strike against slurping is that you might read the whole file into memory, only to find that both patterns occur in the first 10 lines. In that case a short-circuiting last, as shown above, is a big win.


In reply to Re: Finding pages without specific words by grinder
in thread Finding pages without specific words by Anonymous Monk
