
Hello msh210,

Ah, so it's the loading of the entire file that's slowing me down. And you say that your and kcott's solutions don't avoid that.

No, Anonymous Monk did not say that! Quite the contrary: he and kcott are recommending that you read the file line-by-line, stopping when the first match is found. (kcott’s quotation from the Tie::File documentation specifically rules out any need to read in the whole file.)
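For illustration, here is a minimal sketch of the line-by-line approach described above. It is not the code posted by Anonymous Monk or kcott; the helper name first_match and the in-memory sample data are my own assumptions, used only to keep the example self-contained.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): read $fh one line at a time and
# return (line number, line) for the first line matching $re, or an empty
# list if no line matches. Reading stops as soon as a match is found, so the
# rest of the file is never loaded.
sub first_match {
    my ($fh, $re) = @_;
    while (my $line = <$fh>) {
        if ($line =~ $re) {
            chomp $line;
            return ($., $line);    # $. is the current input line number
        }
    }
    return;
}

# Demo on an in-memory filehandle; a real script would open the large file.
my $data = "alpha\nbeta\ngamma\nbeta again\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my ($num, $text) = first_match($fh, qr/beta/);
close $fh;

print "First match on line $num: $text\n" if defined $num;
# prints "First match on line 2: beta"
```

To stop on the first match against a list of strings, one option is to combine them into a single pattern, for example with join '|', map { quotemeta } @list, and pass the compiled regex to the helper.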

Of course, this assumes that if a match occurs at all, it will occur within a single line. If your matches may span two or more lines, you will need to adapt the approach by employing a sliding window technique to examine n lines at a time, moving the window forward each time by discarding the first line and appending the next, (n + 1)th, line. The key task is then to choose n, which must be large enough to accommodate every possible match but should be no larger than necessary, since a larger window means more text to rescan at each step. Finding the most efficient value of n usually requires a certain amount of trial and error.
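A rough sketch of the sliding window idea follows. The helper name first_match_in_window, the sample data, and the choice of n = 2 are assumptions made for the example. The helper reports the line at which the matching window starts, so the match itself begins somewhere within that window.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): keep at most $n lines in a
# window, test the joined window after each new line is read, and return the
# line number of the first line in the first window that matches $re.
# Returns an empty list if no window matches.
sub first_match_in_window {
    my ($fh, $re, $n) = @_;
    my @window;
    my $start = 1;                 # line number of $window[0]
    while (my $line = <$fh>) {
        push @window, $line;
        if (@window > $n) {
            shift @window;         # discard the oldest line
            $start++;
        }
        return $start if join('', @window) =~ $re;
    }
    return;
}

# Demo: the pattern crosses the boundary between lines 2 and 3.
my $data = "one\ntwo start\nend three\nfour\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my $where = first_match_in_window($fh, qr/start\nend/, 2);
close $fh;

print "Match found in the window starting at line $where\n" if defined $where;
# prints "Match found in the window starting at line 2"
```

Because the whole window is rescanned every time a line is added, the cost per step grows with n, which is one reason to keep n as small as your patterns allow.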

Hope that helps,

Athanasius <°(((>< contra mundum   Iustus alius egestas vitae, eros Piratica,

Re^4: grepping a large file and stopping on first match to a list
by msh210 (Monk) on Feb 23, 2016 at 15:32 UTC

    Ah, I hadn't understood. Thank you.

    $_="msh210";$"=$\;@_=@{[split//,uc]}[2,0];$_="@_$\1";$\=$/;++$_[0]for$...1;print lc substr crypt($_,"@_"),1,6