Re^2: Backtracking hurts: slow regexp

Sir, you're the man. Obviously you've understood perfectly what I was trying to achieve: parsing column-delimited HTML data. But I don't understand why your expression is equivalent to mine. I'm a newbie at Perl, and reading perlreref I guessed that the solution might be in ?>, but to be honest, my head hurts when I try to make heads or tails of it. Could you please try to explain, in the simplest possible way, how ?> works?


Replies are listed 'Best First'.
Re^3: Backtracking hurts: slow regexp by massa (Hermit) on Dec 19, 2008 at 13:38 UTC
From the aforementioned perlreref: `(?>...) Grab what we can, prohibit backtracking`. That's it: it does not allow backtracking. So `(?>.*?<\/td>){9}` will get exactly 9 instances of (non-greedy) anything followed by `</td>`. It won't try to go to the end of the string chasing the longest `.*` (because it is not greedy), and once a group has matched through its `</td>`, the engine will not go back into it to try a longer match; if whatever follows fails to match, the whole thing fails without backtracking (working more or less as a deterministic automaton).

[]s, HTH, Massa (κς,πμ,πλ)
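
To see the effect in practice, here is a minimal sketch. The sample row, the anchoring with `^`, and the trailing capture of a tenth cell are all assumptions made up for illustration; they are not the data or the full expression from the original question. The second half uses a toy string to show the difference between an atomic group and a plain non-capturing one.

```perl
use strict;
use warnings;

# Hypothetical sample row with ten cells (not data from the thread).
my $row = join '', map { "<td>c$_</td>" } 1 .. 10;

# Skip nine cells atomically, then capture the contents of the tenth.
# Each (?>...) matches lazily up to the next </td> and is then locked in.
my ($tenth) = $row =~ m!^(?>.*?</td>){9}<td>(.*?)</td>!;
print "tenth cell: $tenth\n";

# Toy comparison of atomic vs. plain grouping on the string "aaab".
# (?>a+) grabs all three a's and refuses to give one back, so "ab" cannot match.
my $atomic = "aaab" =~ /^(?>a+)ab/ ? 1 : 0;
# (?:a+) may backtrack to two a's, leaving "ab" for the rest of the pattern.
my $plain  = "aaab" =~ /^(?:a+)ab/ ? 1 : 0;
print "atomic: $atomic, plain: $plain\n";
```

This should print `tenth cell: c10` and then `atomic: 0, plain: 1`. The second line is the whole point: the atomic group never hands back what it has already grabbed, which is what saves the engine from retrying earlier cells when a later part of the pattern fails.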
