in reply to Re^4: Possible to have regexes act on file directly (not in memory)
in thread Possible to have regexes act on file directly (not in memory)
It's often much simpler than you might think, because you can decompose regexes into smaller and easier parts.
    perl -e'use re 'debug';qr/x{100}.*y{100}/'
    Compiling REx "x{100}.*y{100}"
    Final program:
       1: CURLY {100,100} (5)
       3:   EXACT <x> (0)
       5: STAR (7)
       6:   REG_ANY (0)
       7: CURLY {100,100} (11)
       9:   EXACT <y> (0)
      11: END (0)
    anchored "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"... at 0 floating "yyyyyyyyyyyyyyyyyyyyyyyyyyyyyy"... at 100..2147483647 (checking floating) minlen 200
    Freeing REx: "x{100}.*y{100}"
In this case you start at the beginning of the file and look for 'x' x 100 in a sliding window of size >200. Then you search backwards from the end of the file, again in sliding windows, for 'y' x 100.
This way even greedy matches can be handled (mostly), and the total match might even cover terabytes.
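A rough sketch of that idea, to make it concrete. The sub names (first_offset, last_offset, match_span) and the chunk size are made up for illustration, and it assumes the `.*` may cross newlines (as if the pattern had the /s flag), so take it as one possible way to do it rather than a finished solution. The two fixed parts are located with index/rindex, and each window keeps an overlap of length(needle)-1 bytes so a hit straddling two chunks is not missed.

```perl
use strict;
use warnings;

# Hypothetical helper: offset of the FIRST occurrence of $needle,
# reading forward in chunks of $chunk bytes. Returns -1 if absent.
sub first_offset {
    my ($fh, $needle, $chunk) = @_;
    my $keep = length($needle) - 1;       # overlap carried to the next chunk
    my ($buf, $base, $data) = ('', 0);    # $base: file offset of $buf's first byte
    seek($fh, 0, 0) or die "seek: $!";
    while (read($fh, $data, $chunk)) {
        $buf .= $data;
        my $i = index($buf, $needle);
        return $base + $i if $i >= 0;
        if (length($buf) > $keep) {       # keep only the tail as overlap
            $base += length($buf) - $keep;
            $buf = $keep ? substr($buf, -$keep) : '';
        }
    }
    return -1;
}

# Hypothetical helper: offset of the LAST occurrence of $needle,
# reading backward from the end of the file. Returns -1 if absent.
sub last_offset {
    my ($fh, $needle, $chunk) = @_;
    my $keep = length($needle) - 1;
    seek($fh, 0, 2) or die "seek: $!";    # jump to EOF to learn the size
    my $pos = tell($fh);                  # everything from $pos on is already scanned
    my ($buf, $data) = ('');
    while ($pos > 0) {
        my $len = $pos < $chunk ? $pos : $chunk;
        $pos -= $len;
        seek($fh, $pos, 0) or die "seek: $!";
        (read($fh, $data, $len) // 0) == $len or die "short read";
        $buf = $data . $buf;
        my $i = rindex($buf, $needle);
        return $pos + $i if $i >= 0;
        $buf = substr($buf, 0, $keep);    # head overlaps into the earlier chunk
    }
    return -1;
}

# Emulates /x{100}.*y{100}/s on a file handle (the /s is an assumption).
# Returns (start, end) byte offsets of the greedy match, or an empty list.
sub match_span {
    my ($fh, $chunk) = @_;
    my ($x, $y) = ('x' x 100, 'y' x 100);
    my $start = first_offset($fh, $x, $chunk);
    return if $start < 0;
    my $last = last_offset($fh, $y, $chunk);
    return if $last < $start + length($x);    # the y-run must begin after the x-run
    return ($start, $last + length($y));
}

# Small demo on an in-memory file handle; a real file would be opened
# with '<:raw' instead.
my $demo = join '', 'a' x 1000, 'x' x 100, 'b' x 5000, 'y' x 100, 'c' x 300;
open(my $fh, '<', \$demo) or die "open: $!";
if (my ($from, $to) = match_span($fh, 4096)) {
    print "match spans bytes $from..$to\n";
}
```

Only one window is held in memory at any time, so the distance between the two parts doesn't matter. Patterns with backreferences or alternations that can't be split into fixed pieces would need more care than this.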
Cheers Rolf
( addicted to the Perl Programming Language)
Re^6: Possible to have regexes act on file directly (decompose regex)
by Laurent_R (Canon) on May 02, 2014 at 21:07 UTC | |
by LanX (Saint) on May 02, 2014 at 21:55 UTC |