> For some regexes, it is very easy, for others, it is very difficult or even impossible.

It's often much simpler than you might think, because you can decompose regexes into smaller and easier parts.

    perl -e'use re 'debug';qr/x{100}.*y{100}/'
    Compiling REx "x{100}.*y{100}"
    Final program:
       1: CURLY {100,100} (5)
       3:   EXACT <x> (0)
       5: STAR (7)
       6:   REG_ANY (0)
       7: CURLY {100,100} (11)
       9:   EXACT <y> (0)
      11: END (0)
    anchored "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"... at 0 floating "yyyyyyyyyyyyyyyyyyyyyyyyyyyyyy"... at 100..2147483647 (checking floating) minlen 200
    Freeing REx: "x{100}.*y{100}"

In this case you start by looking for 'x' x 100 in a sliding window of size >200, moving forward from the beginning of the file. Then you search backwards from the end, again in sliding windows, for 'y' x 100.
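A rough sketch of that idea, not a general solution: it is hard-wired to this one pattern and assumes /s semantics (the dot also matches newlines), otherwise the gap between the two runs would have to be checked for newlines as well. The helper names find_first, find_last and match_span and the chunk size parameter are made up for illustration, and the handle is assumed to be seekable and opened in raw mode so that offsets are byte offsets.

```perl
use strict;
use warnings;

# Hypothetical helper: offset of the first occurrence of $needle, reading
# forward in chunks. The last length($needle)-1 bytes are carried over so a
# hit that straddles two chunks is not missed.
sub find_first {
    my ($fh, $needle, $chunk) = @_;
    my $keep = length($needle) - 1;
    my ($buf, $base) = ('', 0);          # $base = file offset of $buf's first byte
    seek $fh, 0, 0 or die "seek: $!";
    while (read($fh, my $data, $chunk)) {
        $buf .= $data;
        my $i = index $buf, $needle;
        return $base + $i if $i >= 0;
        if (length($buf) > $keep) {
            $base += length($buf) - $keep;
            $buf = $keep ? substr($buf, -$keep) : '';
        }
    }
    return -1;
}

# Hypothetical helper: offset of the last occurrence of $needle, reading
# backwards from the end in chunks, with the same kind of overlap.
sub find_last {
    my ($fh, $needle, $chunk) = @_;
    my $keep = length($needle) - 1;
    seek $fh, 0, 2 or die "seek: $!";
    my $pos  = tell $fh;                 # everything from $pos on is already scanned
    my $tail = '';                       # first $keep bytes of the scanned region
    while ($pos > 0) {
        my $start = $pos > $chunk ? $pos - $chunk : 0;
        seek $fh, $start, 0 or die "seek: $!";
        read($fh, my $data, $pos - $start) // die "read: $!";
        my $buf = $data . $tail;
        my $i = rindex $buf, $needle;
        return $start + $i if $i >= 0;
        $tail = substr($buf, 0, $keep);
        $pos  = $start;
    }
    return -1;
}

# Decomposed /x{$n}.*y{$n}/s: leftmost run of x, then (greedy) the last run of y.
# Returns (start, end) offsets with end exclusive, or an empty list if no match.
sub match_span {
    my ($fh, $n, $chunk) = @_;
    my $from = find_first($fh, 'x' x $n, $chunk);
    return if $from < 0;
    my $y = find_last($fh, 'y' x $n, $chunk);
    return if $y < $from + $n;           # the y run must not overlap the x run
    return ($from, $y + $n);
}
```

Called as match_span($fh, 100, 1 << 20), this never holds more than roughly one chunk plus 99 bytes in memory, no matter how far apart the two runs are.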

This way even greedy matches can be handled (mostly), and the total match might even span terabytes.

Cheers Rolf

(addicted to the Perl Programming Language)


In reply to Re^5: Possible to have regexes act on file directly (decompose regex) by LanX
in thread Possible to have regexes act on file directly (not in memory) by Nocturnus
