in reply to safe untrusted regexp

Perl won't let you compile a regexp containing (?{...}) or (??{...}) blocks from a string interpolated at runtime unless you also declare use re 'eval'. That won't stop someone from handing you a regexp that's designed to run out of C stack. To solve that second problem you could upgrade to the 5.9.3+ regexp engine, which isn't recursive and is now fully reentrant. There are patches against earlier versions of perl, but I don't have them handy to link to. Perhaps someone else will.
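If you want to check the first point yourself, here is a minimal sketch. The helper name compile_untrusted and the sample patterns are made up for illustration. It assumes use re 'eval' is not in scope anywhere around the qr//, and it only shows the code-block refusal, not the C stack problem.

```perl
use strict;
use warnings;

# Hypothetical helper: try to compile a pattern supplied as a plain string.
# With no "use re 'eval'" in scope, perl dies on embedded code blocks and
# eval {} traps that error.
sub compile_untrusted {
    my ($pattern) = @_;
    my $re = eval { qr/$pattern/ };
    return ($re, $@);
}

# A pattern carrying an embedded code block, as an attacker might send it.
my ($re, $error) = compile_untrusted('(?{ print "code ran\n" })x');
print defined $re ? "compiled\n" : "refused: $error";

# An ordinary pattern still compiles and matches as usual.
my ($plain) = compile_untrusted('^a+b$');
print "plain pattern matched\n" if defined $plain && 'aab' =~ $plain;
```

The first call should be refused with an "Eval-group not allowed at runtime" error and the second should compile normally. A refused pattern never gets as far as running its code block.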

⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊

Re^2: safe untrusted regexp
by jettero (Monsignor) on Aug 16, 2006 at 15:37 UTC
    I just hand-checked this "won't let you compile regexps" business. I'm completely surprised by that, thanks.

    Besides regexps that never finish, is there anything I actually do need to worry about when qr-ing untrusted user expressions?

      I didn't say it directly but now I will. A regexp on perl's recursive regexp engine can cause it to run out of C stack, which then triggers a segfault. That aborts your program. There are patches to perl for versions like 5.8.4+ (or similar) to either mitigate this or completely work around it. This problem is completely gone in 5.9.4. You could upgrade to that immediately if you wished. It was just released yesterday.

      ⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊

        Could you give an example of a regexp that would chew up all the memory on a machine? I'm utterly fascinated by this, as I was unaware you could cause recursion in a regexp.

        Or does it have to be like a gig of "(((((((((((((((((((((" to do it?

        The examples I'm seeing seem to use (??{ to build lambdas into the regexps. I suspect that wouldn't apply if they were compiled at runtime -- i.e., without use re 'eval'.