in reply to Re: How to optimize a regex on a large file read line by line ?
in thread How to optimize a regex on a large file read line by line ?

The code is incomplete because a match could span two chunks.

You need to seek back by the length of the longest possible match (here 8) before reading the next chunk.

Actually, the correct number is something like min(p, m),

with p = chunksize - pos

and m = the length of the longest possible match.
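
For illustration, here is a minimal sketch of the seek-back idea, not the code from the parent post. The sub name `count_matches_chunked` and its parameters are made up for this example, and I am reading `pos` as the end of the last match inside the current chunk, which is an assumption. It also assumes a regular, seekable file and a pattern whose matches have a known maximum length (a fixed-length pattern in the simplest case). An open-ended pattern such as `/a+/` could still be cut at a chunk border and counted twice.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_CUR);

# Hypothetical helper: count non-overlapping matches of $re in the file $path,
# reading $chunk_size bytes at a time. $max_len is the length of the longest
# possible match ("m" in the formula above).
sub count_matches_chunked {
    my ($path, $re, $max_len, $chunk_size) = @_;
    die "chunk size must exceed the longest match\n" if $chunk_size <= $max_len;

    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my $count = 0;
    my $buf;

    while (my $got = read($fh, $buf, $chunk_size)) {
        my $last_end = 0;                 # end offset of the last match in this chunk
        while ($buf =~ /$re/g) {
            $count++;
            $last_end = pos $buf;
        }
        last if $got < $chunk_size;       # short read: end of file reached

        # Seek back min(p, m) bytes so a match straddling the border is seen
        # whole in the next chunk, without rescanning an already counted match.
        my $p    = $got - $last_end;      # unmatched tail of this chunk
        my $back = $p < $max_len ? $p : $max_len;
        seek($fh, -$back, SEEK_CUR) or die "seek failed: $!";
    }
    close $fh;
    return $count;
}
```

Something like `count_matches_chunked($file, qr/ABCDEFGH/, 8, 1 << 20)` would then scan the file in 1 MiB chunks. Strictly speaking m - 1 bytes of overlap would be enough, but the sketch sticks to the formula as stated.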

Cheers Rolf
(addicted to the Perl Programming Language and ☆☆☆☆ :)
Je suis Charlie!

Re^3: How to optimize a regex on a large file read line by line ?
by Anonymous Monk on Apr 18, 2016 at 02:27 UTC

    A match always lies within a single line; that's the purpose of the line

    $_ .= <$fh> // '';

    It completes a partial line.
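
For readers who have not seen the rest of the thread, a minimal sketch of how that line fits into a chunked read loop might look like this. The sub name `count_matches_by_lines` and its parameters are invented for the example, and it assumes Perl 5.10 or later for the `//` operator.

```perl
use strict;
use warnings;

# Hypothetical helper: count matches of $re in the file $path, reading roughly
# $chunk_size bytes at a time but always ending each buffer on a line boundary.
sub count_matches_by_lines {
    my ($path, $re, $chunk_size) = @_;

    open my $fh, '<', $path or die "Cannot open $path: $!";
    my $count = 0;
    my $buf;

    while (read($fh, $buf, $chunk_size)) {
        # read() usually stops in the middle of a line; readline appends the
        # rest of it. At end of file <$fh> returns undef, hence the // ''.
        $buf .= <$fh> // '';
        $count++ while $buf =~ /$re/g;
    }
    close $fh;
    return $count;
}
```

Because every buffer ends where a line ends, a pattern that cannot match across a newline never gets split between two buffers, so no seeking back is needed.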

      Ahh! You are combining read with readline ...

       $_ .= <$fh> // ''; # finish partial line

      That's a good trick!

      (As long as a single line doesn't grow bigger than available memory, but that's hardly the case here.)

      Cheers Rolf