in reply to Site Search perlscript and security

Just for starters, (??{ ... }) can be used to execute code. That would be bad. There's also nothing stopping future versions of Perl from adding some other way to execute code or do other nasty things if you let the search string go through. So even if you're safe now, you might not be safe when you upgrade Perl a few years from now.

You absolutely need to have a deny-by-default policy. Run the string through something like this before searching:

if( $search !~ /\A ([A-Za-z0-9 ]+) \z/x ) { print "Error, can't run search\n"; }

You can add in as many characters as you need, but I suspect most searches don't need anything more than ASCII upper- and lower-case, numbers, and a space.
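
As a rough sketch of how that check might be wrapped up for reuse (validate_search is a made-up name for illustration, not part of any module), something like this returns the string only when every character is on the allow-list:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the search string if every character is on
# the allow-list, or undef otherwise (deny by default).
sub validate_search {
    my ($search) = @_;
    return unless defined $search;

    # Whitespace inside [...] stays significant under /x, so the space in
    # the character class is matched literally. \z (rather than $) also
    # rejects a trailing newline.
    if ( $search =~ /\A ([A-Za-z0-9 ]+) \z/x ) {
        return $1;
    }
    return;
}

my $search = validate_search('perl regex security');
if ( defined $search ) {
    print "Searching for: $search\n";
}
else {
    print "Error, can't run search\n";
}
```

Returning the captured $1 instead of the original variable also happens to be the form Perl's taint mode expects, should the script ever run under -T.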

"There is no shame in being self-taught, only in not trying to learn in the first place." -- Atrus, Myst: The Book of D'ni.

Re^2: Site Search perlscript and security
by steelrose (Scribe) on Nov 29, 2005 at 17:49 UTC
    This is exactly what my concern was (though admittedly, I didn't realize you could actually run code in a regex...)

    It's just a simple text search, so I should have success using a form of your if statement. I think what I'll do, though, is strip the string of anything that isn't A-Z, a-z, 0-9, or a space, then use that string to feed the regex.

    I'll play around with it (I'm still learning about regexes) and see what I can come up with. In the meantime, if anyone has a good solution and wants to post it, I'll check back later to see how my solution compares. Thanks.

    If you give a man a fish he will eat for a day.
    If you teach a man to fish he will buy an ugly hat.
    If you talk about fish to a starving man, you're a consultant.
Re^2: Site Search perlscript and security
by BUU (Prior) on Nov 30, 2005 at 00:02 UTC
    Just for starters, (??{ ... }) can be used to execute code. That would be bad. There's also nothing stopping future versions of Perl from adding some other way to execute code or do other nasty things if you let the search string go through. So even if you're safe now, you might not be safe when you upgrade Perl a few years from now.
    Just to note, for a pattern interpolated from a variable at run time, this is only true if you specifically enable it via use re 'eval';. As for perl suddenly allowing a new way to interpolate code into a regex and execute it, this is really rather unlikely. If they do, it will almost certainly only work with a new, specific switch.
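
    A small sketch of that behaviour, assuming a stock perl with no use re 'eval' in scope (the hostile input string is invented for the demonstration):

```perl
use strict;
use warnings;

# Hypothetical hostile input: a regex code block that would run Perl code.
my $user_input = '(?{ print "code ran\n" })';

# The string is interpolated into the pattern at run time. Without
# "use re 'eval'" in scope, perl refuses to compile the pattern and dies,
# so the eval block returns undef and $@ holds the error.
my $matched = eval { 'anything' =~ /$user_input/ ? 1 : 0 };
my $error   = $@;

if ( defined $matched ) {
    print "Pattern compiled; matched = $matched\n";
}
else {
    print "Pattern rejected: $error";
}
```

    Note that this only covers code execution. An unescaped search string can still produce a syntax error or a very slow pattern, so filtering or escaping the input is worth doing either way.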
Re^2: Site Search perlscript and security
by steelrose (Scribe) on Nov 29, 2005 at 21:06 UTC
    So, my solution:

    $string =~ s/[^\w\s]|_//g;

    then do the match using the string data. And of course, print a disclaimer on the page with the text box for the users that special characters will be ignored ;)
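
    A rough sketch of the "then do the match" step, under the assumption that the pages are already available as plain strings (strip_search and @pages are invented for the example):

```perl
use strict;
use warnings;

# Hypothetical helper: remove every character that is not a word character
# or whitespace, and remove underscores as well.
sub strip_search {
    my ($string) = @_;
    ( my $clean = $string ) =~ s/[^\w\s]|_//g;
    return $clean;
}

# Hypothetical sample data standing in for the site's pages.
my @pages = ( 'Perl regex tutorial', 'Cooking with fish', 'Ugly hats' );

my $term = strip_search('regex; tutorial!');

# \Q...\E escapes any metacharacters that remain, as a second line of
# defence. An empty term is skipped, since an empty pattern would match
# every page.
my @hits = length($term) ? ( grep { /\Q$term\E/i } @pages ) : ();

print "Found: $_\n" for @hits;
```

    The empty-term check matters because a search made up entirely of special characters strips down to nothing.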

      IMHO, \w and \s are too liberal in what they accept. Chances are that your search will not need Unicode, and \w in particular is going to match Unicode letters and digits if your perl has Unicode support. Unless you know you need Unicode, it's probably better to use the explicit character class [A-Za-z0-9].
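
      To illustrate the difference, here is a minimal sketch using a Greek alpha as an arbitrary non-ASCII letter (the sample character is just an assumption for the demo):

```perl
use strict;
use warnings;

# GREEK SMALL LETTER ALPHA: a letter well outside the ASCII range.
my $alpha = "\x{3b1}";

# \w follows Unicode rules for this string, so it accepts the letter;
# the explicit ASCII class does not.
my $matches_w     = ( $alpha =~ /\A\w\z/ )          ? 1 : 0;
my $matches_ascii = ( $alpha =~ /\A[A-Za-z0-9]\z/ ) ? 1 : 0;

printf "\\w: %d, explicit class: %d\n", $matches_w, $matches_ascii;
# prints "\w: 1, explicit class: 0"
```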

        A good point. Since the data I plan to search contains very few characters outside A-Z, a-z, and 0-9 that would need to be searchable, I can just add those characters to the character class (such as é, the e with an acute accent).
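
        One possible sketch of that, assuming the form data arrives as UTF-8 encoded bytes and needs decoding before the check (is_allowed_search is a made-up name; Encode ships with perl):

```perl
use strict;
use warnings;
use Encode qw(decode);

# Allow-list: ASCII letters, digits, space, plus e-acute (U+00E9).
my $ALLOWED = qr/\A[A-Za-z0-9 \x{e9}]+\z/;

# Hypothetical helper: decode the raw bytes (assumed to be UTF-8) and test
# the resulting characters against the allow-list. Input that is not valid
# UTF-8 is rejected.
sub is_allowed_search {
    my ($raw_bytes) = @_;
    my $text = eval { decode( 'UTF-8', $raw_bytes, Encode::FB_CROAK ) };
    return 0 unless defined $text;
    return $text =~ $ALLOWED ? 1 : 0;
}

# "caf\xc3\xa9" is the UTF-8 byte sequence for "café".
print is_allowed_search("caf\xc3\xa9") ? "ok\n" : "rejected\n";
```

        Without the decoding step, the two bytes of the encoded é would be tested one at a time and neither would be on the list.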