
Re^3: Allowing regex entries in web form to search database: Risks or gotchas?

by Jenda (Abbot)
on Aug 10, 2022 at 20:32 UTC (#11146087)

in reply to Re^2: Allowing regex entries in web form to search database: Risks or gotchas?
in thread Allowing regex entries in web form to search database: Risks or gotchas?

You wrote "database", so I assumed there's a database engine, say PostgreSQL, and that's where you store the data. If so, you could use the regexps provided by that database engine, use Perl within that engine, or fetch all the data to be searched and evaluate the expressions within the script.

It's you who defines safe and you need to decide what's safe for each individual use. The point is that instead of

if ($input =~ /something I already know is dangerous/) { die 'I refuse to handle this!'; }
you should always write
if ($input !~ /^only stuff I know is fine$/) { die 'I refuse to handle this!'; }

I can't give you a generic "this is unsafe" or a generic "this is safe" without knowing what happens to the $input afterwards. That's something you have to decide. The thing is that it's much easier to forget to list something that's dangerous than it is to accidentally allow something that's dangerous.
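As a minimal sketch of the "allow only what I know is fine" check: the allowed character set and the 100-character limit below are assumptions made up for illustration, and `is_acceptable` is a hypothetical helper, not something from the original post. It uses `\A` and `\z` rather than `^` and `$`, because `$` also matches just before a trailing newline.

```perl
use strict;
use warnings;

# Hypothetical allowlist: letters, digits, spaces and a handful of regex
# metacharacters. What belongs here depends on what happens to the input later.
my $ALLOWED = qr/\A[A-Za-z0-9 .*?|]{1,100}\z/;

# Returns 1 if the input consists only of allowed characters, 0 otherwise.
sub is_acceptable {
    my ($input) = @_;
    return 0 unless defined($input);
    return $input =~ $ALLOWED ? 1 : 0;
}

my $input = 'foo.*bar';
die "I refuse to handle this!\n" unless is_acceptable($input);
```

Anything not explicitly listed, such as parentheses, braces or quotes, is rejected without having to anticipate it.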

1984 was supposed to be a warning,
not a manual!

Replies are listed 'Best First'.
Re^4: Allowing regex entries in web form to search database: Risks or gotchas?
by LanX (Sage) on Aug 10, 2022 at 22:48 UTC
    I was trying to find a source supporting the "whitelisting is safer than blacklisting" approach, but alas most links revolved around the question of whether these terms are racist and should be replaced, and by what.

    Too much of a master-slave dilemma for a shady dark-pinky guy like me.

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery

      Blacklist (blocklist, redlist) is always playing catchup.
Re^4: Allowing regex entries in web form to search database: Risks or gotchas?
by Polyglot (Hermit) on Aug 11, 2022 at 02:39 UTC

    For clarification: I don't trust MySQL/MariaDB for regex operations. The only part it plays in this is to turn over the records, after which they are searched via Perl, and any that match following the search get formatted and returned to the client's browser. The database is entirely isolated from the regex operations.
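A rough sketch of that arrangement, assuming the records have already been fetched into a plain list of strings: `compile_pattern` and `search_rows` are hypothetical names chosen for illustration. Compiling inside `eval` turns a malformed pattern into an error you can report rather than a crash. Also, unless `use re 'eval'` is in effect, Perl refuses to compile `(?{ ... })` code blocks that arrive through an interpolated variable, so that particular route is closed by default.

```perl
use strict;
use warnings;

# Hypothetical helper: compile the user's pattern.
# Returns a compiled regex, or undef if the pattern does not compile.
sub compile_pattern {
    my ($pattern) = @_;
    my $re = eval { qr/$pattern/ };
    return $re;
}

# Hypothetical helper: keep only the rows that match the compiled pattern.
sub search_rows {
    my ($re, @rows) = @_;
    return grep { $_ =~ $re } @rows;
}

my $re = compile_pattern('ba[rz]');
if (defined $re) {
    my @hits = search_rows($re, 'foo', 'bar', 'baz');
    print "$_\n" for @hits;
}
else {
    print "Sorry, that is not a valid pattern.\n";
}
```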

    On one hand, I might agree with your premise that one should only use what is trusted. But that word "trusted" is precisely where things get sticky. What or whom do you trust?

    If you cannot define or distinguish between what is "safe and trusted" and what is "unsafe or dangerous," then there is no validity in saying "allow only what is safe."

    For illustration: personally, I don't trust Microsoft Windows anymore, having had too many virus and security problems with it in the past. One time I was having some issues with my router and couldn't seem to get it to NAT the internet through to my PC, so I temporarily bypassed the router and hooked up directly to the DSL modem (looking for answers online to solve the router issue). I kid you not, within five minutes someone was beginning to control my computer, i.e. the mouse was moving and things were changing on screen without my input. I instantly disconnected the patch cable and never tried that again with a Windows computer. (I've done similar things with Linux and Mac OS X with no problem.) I mean, five minutes!

    Because Windows itself can be problematic, should one not trust it for anything? Where does one draw the line? And this is the part that you seem unwilling to attempt to define--which is why there is a weakness in your reasoning.

    There is no real-world chance of any software being 100% perfectly safe. One must, of necessity, work with a reasonable level of risk (some might use the term "manageable risk"). My original question here asked for guidance as to what the specific risk factors might be. I have had very little response, other than the CPU-crashing possibilities of wildcard use in the regex. To me, this indicates that the use of regex itself is not a big security risk, or I would have many ready to jump in with their own reports of the known risks.
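One way to bound the CPU risk is to run the match in a child process and kill it after a time limit. The sketch below assumes a Unix-like system with `fork`; `match_with_timeout` is a hypothetical helper and the limit is arbitrary. A plain `alarm` around the match in the same process is less reliable, because Perl defers signal handlers until the current opcode finishes, and a single regex match is one opcode.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: run a match in a child process with a time limit.
# Returns 1 (matched), 0 (no match), or undef (timed out or child failed).
sub match_with_timeout {
    my ($re, $text, $seconds) = @_;

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: report the result through the exit status, skipping END blocks.
        POSIX::_exit($text =~ $re ? 0 : 1);
    }

    # Parent: kill the child if it is still running when the alarm fires.
    my $timed_out = 0;
    local $SIG{ALRM} = sub { $timed_out = 1; kill 'KILL', $pid };
    alarm $seconds;
    waitpid($pid, 0);
    alarm 0;

    # Treat a timeout or a child killed by any signal as "no answer".
    return undef if $timed_out || ($? & 127);

    my $status = $? >> 8;
    return $status == 0 ? 1 : $status == 1 ? 0 : undef;
}

my $result = match_with_timeout(qr/ba[rz]/, 'bar', 2);
print defined $result ? "result: $result\n" : "gave up\n";
```

Forking per search is expensive, so capping the length of both the pattern and the searched text is a cheaper first line of defence.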

    Which brings it back to the essential question: Are there any big "gotchas" with allowing regex in a search field?


