in reply to Untaint a string match, regular expression.

As a first pass at understanding your concerns, my first thought is to simply ban any regex that contains either of the extended patterns that allow code execution: (?{ ... }) and (??{ ... }).

To that end, test if the regex contains either of those patterns:

die "Regex containing code disallowed" if $userRe =~ m[\(\?\??\{];

Combine that with a check that the regex will compile: $userRe = qr[$userRe]; and it's hard to see how any input that passed those two checks could be dangerous.
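
The two checks can be combined into one routine. What follows is only a minimal sketch: the name compile_user_regex and its "undef means rejected" convention are assumptions made for illustration, not anything from the OP's code.

```perl
use strict;
use warnings;

# Hypothetical helper (name and return convention are assumptions):
# returns a compiled regex object, or undef if the pattern is rejected.
sub compile_user_regex {
    my ($userRe) = @_;

    # Check 1: refuse the (?{ ... }) and (??{ ... }) code constructs.
    return if $userRe =~ m[\(\?\??\{];

    # Check 2: make sure the pattern compiles; eval traps the syntax error.
    my $compiled = eval { qr/$userRe/ };
    return $compiled;    # undef if compilation died
}

for my $candidate ('fo+bar', '(?{ die 666 })', 'unbalanced(') {
    my $re = compile_user_regex($candidate);
    printf "%-16s %s\n", $candidate, defined $re ? 'accepted' : 'rejected';
}
```

Wrapping qr// in eval means a malformed pattern is reported as a rejection rather than killing the program.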


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority". I'm with torvalds on this
In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked

Re^2: Untaint a string match, regular expression.
by Anonymous Monk on May 18, 2015 at 00:16 UTC
    :) it pretty much does that by default :)
    $ perl -e" my $re = shift; 1 =~ /$re/; " "(??{die666})"
    Eval-group not allowed at runtime, use re 'eval' in regex m/(??{die666})/ at -e line 1.

      But that is rather easily bypassed:

      C:\Users\HomeAdmin>set PERL5OPT=-Mre=eval
      C:\Users\HomeAdmin>perl -e" my $re = shift; 1 =~ /$re/; " "(?{die666})"
      C:\Users\HomeAdmin>

      I agree that anything the user could supply to the OP's program from the command line, they could equally supply to perl directly; but that's partly why I phrased my response the way I did, i.e. trying to tease out exactly what the OP's concerns are.

      For example, perhaps the arguments that will be supplied to the OP's program originate from a web page interface accessible to 'external' users.


        *giggle* and also easily unbypassed
        $ perl -le"use re 'eval'; no re 'eval'; my $re=shift; 1=~/$re/;" "(??{die666})"
        Eval-group not allowed at runtime, use re 'eval' in regex m/(??{die666})/ at -e line 1.

        If the user has command line access, then I see no reason to stop them from running Perl code. A good starting point is "originate from a web page interface accessible to 'external' users." The question I asked was about untainting, so it covers any and all reasons one might want to untaint a string that will be used to match against another string, including but not limited to the above.

        In this case, a Nagios module, NRPE could be configured to allow argument passing. That gives the remote monitoring server the ability to specify any string, even though it may not have command line access.
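
        For the untainting step itself, perlsec describes laundering through a regex capture. A rough sketch of how that could be paired with the embedded-code check follows; untaint_pattern and its acceptance rule are hypothetical, and the capture only launders taint when the script runs under -T without "use re 'taint'" in scope.

```perl
use strict;
use warnings;

# Hypothetical helper (name and acceptance rule are assumptions):
# returns an untainted copy of an acceptable pattern string, or undef.
sub untaint_pattern {
    my ($tainted) = @_;

    # Refuse embedded-code constructs before laundering anything.
    return if $tainted =~ m[\(\?\??\{];

    # The captured substring is what taint mode treats as clean.
    my ($clean) = $tainted =~ /\A(.*)\z/s;

    # Only hand the string back if it also compiles as a regex.
    return unless eval { my $re = qr/$clean/; 1 };
    return $clean;
}

my $pattern = untaint_pattern(defined $ARGV[0] ? $ARGV[0] : '^OK\b');
print defined $pattern ? "using pattern: $pattern\n" : "pattern rejected\n";
```

        The capture here accepts every character, so all of the actual vetting is done by the two checks around it.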
Re^2: Untaint a string match, regular expression.
by cheako (Beadle) on May 18, 2015 at 00:27 UTC
    The documentation suggests that unless use re 'eval' is in scope, there is at least some protection from embedded code.