I think I understand what you're talking about, but I'm not sure. You see, the problem is the way you framed the question. The first line of the post suggests that you have a performance problem. In general, a performance problem is one in which systems are being overloaded or time constraints aren't being met.

From the rest of the post, however, I get the impression that this is just something you do every now and then, and that it's just a little slow, so maybe you have to go make a cup of coffee while it runs.

The two are very different. If you are hunting this data regularly in a constrained environment, there are numerous techniques you can use to boost the performance of your search, such as sorted character indexes and character-pair indexes.
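To give a rough idea of what a pair index could look like, here is a minimal sketch in Python (not Perl, purely for illustration). The names `build_pair_index` and `indexed_find_all` are my own invention, and this is only one way to read "pair index": map every two-character pair to the sorted list of offsets where it occurs, then check the pattern only at offsets suggested by its rarest pair.

```python
from collections import defaultdict


def build_pair_index(text):
    """Map each two-character pair to the ascending list of offsets where it starts."""
    index = defaultdict(list)
    for i in range(len(text) - 1):
        index[text[i:i + 2]].append(i)
    return index


def indexed_find_all(text, index, pattern):
    """Return every offset where pattern occurs, verifying only index candidates."""
    if len(pattern) < 2:
        raise ValueError("pattern must contain at least one character pair")
    # Pick the pair of the pattern that occurs least often in the text,
    # so that as few candidate offsets as possible need a full comparison.
    best_offset = min(
        range(len(pattern) - 1),
        key=lambda k: len(index.get(pattern[k:k + 2], ())),
    )
    pair = pattern[best_offset:best_offset + 2]
    hits = []
    for pos in index.get(pair, ()):
        start = pos - best_offset
        if start >= 0 and text.startswith(pattern, start):
            hits.append(start)
    return hits


text = "teateteatea"
index = build_pair_index(text)
print(indexed_find_all(text, index, "teatea"))
```

The index costs memory and has to be rebuilt whenever the data changes, which is why it only pays off if you search the same data many times.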

On the other hand, these will take effort to implement. If you're just getting bored of waiting and want a faster regex, the one you put in the update is probably going to be about it. You won't be able to remove the backtracking entirely. As a worst-case example, imagine you're looking for "teatea" in the string "teateteatea". It's going to backtrack no matter what regular expression you use to pull off the match.
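To see where the wasted work in the "teatea" example comes from, here is a small Python sketch of a naive left-to-right matcher. It is a simplification of what a backtracking regex engine does, not a model of any particular engine, and `naive_search` is a made-up name. It records, for every starting offset that fails, how many characters matched before the attempt was abandoned.

```python
def naive_search(text, pattern):
    """Return (match_offset, failed), where failed lists (start, chars_matched)
    for every starting offset tried before the match. match_offset is -1 if
    the pattern does not occur."""
    failed = []
    for start in range(len(text) - len(pattern) + 1):
        matched = 0
        while matched < len(pattern) and text[start + matched] == pattern[matched]:
            matched += 1
        if matched == len(pattern):
            return start, failed
        # The characters matched so far are thrown away and the search
        # restarts one position further along: that is the backtracking.
        failed.append((start, matched))
    return -1, failed


offset, failed = naive_search("teateteatea", "teatea")
print(offset)   # 5
print(failed)   # [(0, 5), (1, 0), (2, 0), (3, 2), (4, 0)]
```

The attempt at offset 0 gets five characters in before it fails, and the one at offset 3 gets two in. That partial progress has to be given up and re-examined from a later starting point.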

The other viable alternative is to push all the data into a decent database and let it worry about it. It all depends on how often the data changes and how often you do the searches.
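As a very rough sketch of the database route, assuming the data can be loaded as one record per line: the Python snippet below uses the standard `sqlite3` module with an in-memory database. The table name `records` and the helper `search_in_database` are hypothetical. Note that a plain substring query like this still scans every row, so whether it beats the regex depends on the database and on what indexing it offers.

```python
import sqlite3


def search_in_database(lines, needle):
    """Load lines into an in-memory SQLite table and return those containing needle."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE records (id INTEGER PRIMARY KEY, line TEXT NOT NULL)"
        )
        conn.executemany(
            "INSERT INTO records (line) VALUES (?)",
            ((line,) for line in lines),
        )
        # instr() returns the 1-based position of needle in line, or 0 if absent.
        rows = conn.execute(
            "SELECT line FROM records WHERE instr(line, ?) > 0 ORDER BY id",
            (needle,),
        ).fetchall()
        return [row[0] for row in rows]
    finally:
        conn.close()


print(search_in_database(["teateteatea", "coffee", "green tea"], "tea"))
```

In practice you would load the data once into a file-backed database and only re-run the query, rather than reloading on every search.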

It's a pity that you didn't specify those factors :/


In reply to Re: Removing backtracking from a .*? regexp by Anonymous Monk
in thread Removing backtracking from a .*? regexp by grinder
