in reply to Re: Inexplicably slow regex
in thread Inexplicably slow regex

I'd also be interested in seeing the benchmarks. I can add one thing: the last regex isn't the same as the others, because the [^\n] character class is (accidentally) a negated character class. Rearranging it to [\n^] should fix that.

Hays

Re^3: Inexplicably slow regex
by Anonymous Monk on Sep 12, 2006 at 18:52 UTC
    There's nothing accidental about the negation of that character class. I'm using it inside a negative look-behind assertion, so it means: "it is not true that the preceding character is something other than a linebreak". In other words, the preceding character is either a linebreak or there's nothing there at all (beginning of string).
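
For anyone following along, here is a minimal sketch of that double negative in action. The sample string and the literal "foo" are made up for illustration; they are not the regexes from the original post.

```perl
use strict;
use warnings;

# Hypothetical sample text: "foo" appears at the start of the string,
# right after a linebreak, and in the middle of a line.
my $text = "foo\nfoo bar foo\n";

my @pos;
# (?<![^\n]) succeeds when the previous character is NOT a non-linebreak,
# i.e. it is a linebreak or there is no previous character at all.
while ($text =~ /(?<![^\n])foo/g) {
    push @pos, $-[0];    # $-[0] is the start offset of the match
}

print "@pos\n";    # prints: 0 4
```

The "foo" at offset 12 is skipped because the character before it is a space. For this input the assertion behaves much like ^ under the /m modifier.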

    Processing a file of approximately 0.5 MB, and using gettimeofday for timing, I get:
    First version: 0.7 seconds
    Second version: 0.003 seconds
    Third version: 0.03 seconds
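
A rough sketch of how timings like these could be collected with Time::HiRes. The input text and the two patterns below are stand-ins invented for illustration, not the three versions being discussed, so the numbers it prints say nothing about the original problem.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Hypothetical input: 30,000 short lines, roughly 0.5 MB in total.
my $text = "some line of text\n" x 30_000;

# Hypothetical patterns to compare.
my %regex = (
    'look-behind' => qr/(?<![^\n])some/,
    'anchored'    => qr/^some/m,
);

my %count;
for my $name (sort keys %regex) {
    my $re    = $regex{$name};
    my $start = [gettimeofday];

    # Count all matches; the empty list assignment forces list context.
    $count{$name} = () = $text =~ /$re/g;

    # tv_interval with a single argument measures from $start until now.
    printf "%-12s %d matches in %.4f s\n",
        $name, $count{$name}, tv_interval($start);
}
```

A single run like this is noisy; repeating each measurement several times, or using the Benchmark module, would give steadier figures.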

    My actual regexes are slightly more complicated than the examples given, so I see little speed difference between #2 and #3.
      You may wish to show your actual regex, as that's the likely source of the issue.
        The actual regular expressions start exactly as above but require additional matching text at the end. Before posting, I retested my input against the example regular expressions and encountered the same performance problems, so the examples should be enough for analysis.

        I'm sure providing my rather large input file would have helped, but that would be difficult to manage. Luckily, the user Grandfather below provided a self-contained working example that illustrates the problem.