LanX has asked for the wisdom of the Perl Monks concerning the following question:
I'm looking for a clean ways to intercept SQLs prepared/executed by DBI.pm and log the caller trace of certain expressions (uses of subqueries with ORDER-BY clauses).
The idea is to be alarmed about critical code dynamically.
I'm thinking of using an regex to find all queries where SELECT is not at the start and ORDER or GROUP are used and write the caller chain to a log-file.
SQL::Parser could be used for an in detail introspection then.
Before reinventing the wheel I'd like to ask for your opinion and ready to use solutions to monitor "bad" SQL.°
readmore for the SQL background problem:
My client is confronted with some inconsistencies in MariaDB 10.6 which boils down to the fact that the SQL standard doesn't allow a subqueries to have an ORDER BY clause.
MariaDB allows ORDER-BY but (often) ignores them as long they are not LIMITed. No error is thrown if the LIMIT is missing.
A large part of our code-base was written with MariaDB 5.X when this wasn't an issue.
One of my products uses an emulation of ROW_NUMBER() written similar to this one in a subqueries inside an INSERT or REPLACE statement. After updating we noticed that this randomly started to fail, but I was able to fix that specific code with teh actual ROW_NUMBER() which is available with other "window functions" since MariaDB 10.2
But we are concerned that other subqueries are effected and nailed it down to this bug-report
A "table" (and subquery in the FROM clause too) is - according to the SQL standard - an unordered set of rows. Rows in a table (or in a subquery in the FROM clause) do not come in any specific order. That's why the optimizer can ignore the ORDER BY clause that you have specified. In fact, the SQL standard does not even allow the ORDER BY clause to appear in this subquery (we allow it, because ORDER BY ... LIMIT ... changes the result, the set of rows, not only their order).
You need to treat the subquery in the FROM clause, as a set of rows in some unspecified and undefined order, and put the ORDER BY on the top-level SELECT.
It beats me why Maria-DB is not throwing an error if the LIMIT is missing in an "ordered" subquery. ²
I also identified a global setting for concurrent inserts which defaults to AUTO and might cause INSERT-SELECTs to ignore order in subqueries.
This is in contrast to MySQL REFERENCE documentation which recommends explicitly using ORDER with INSERT-SELECTs to guaranty some replication scenarios to work. oO
The order in which a SELECT or TABLE statement with no ORDER BY clause returns rows is nondeterministic. This means that, when using replication, there is no guarantee that such a SELECT returns rows in the same order on the source and the replica, which can lead to inconsistencies between them. To prevent this from occurring, always write INSERT ... SELECT or INSERT ... TABLE statements that are to be replicated using an ORDER BY clause that produces the same row order on the source and the replica. See also Section 17.5.1.18, “Replication and LIMIT”.
It's hard to come by a clean SQL-documentation or even SQL standards, but this quote seems to indicate that a query inside INSERT-SELECT is NOT considered a subquery and hence should be exempt from the order exception above(?)³
I'm planning to discuss this on an SQL board on Reddit or Stack-Overflow, just didn't want to keep you in the dark about this mess.
°) found this on SO how-to-intercept-queries-in-a-subclassed-dbi
²) compare how MS-SQL is handling that: ORDER BY can only be specified when TOP is also specified.
³) or is this a fuzzy way to say to always combine a LIMIT with ORDER-BY to allow replication to stay reliable?
Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
Re: Intercepting critical SQL inside DBI.pm
by kikuchiyo (Hermit) on Sep 16, 2022 at 09:15 UTC | |
by LanX (Saint) on Sep 16, 2022 at 13:12 UTC | |
by Corion (Patriarch) on Sep 16, 2022 at 15:39 UTC | |
by LanX (Saint) on Sep 16, 2022 at 16:18 UTC | |
|
Re: Intercepting critical SQL inside DBI.pm
by LanX (Saint) on Feb 22, 2023 at 14:36 UTC |