Re: Parsing a boolean search string for SQL query
by herveus (Prior) on Sep 01, 2003 at 17:05 UTC
|
Howdy!
The reason you miss the last term is because the regexp
requires the OR/AND/NOT. Put a ? after that group and it
picks up the last term (at the expense of an "use of
uninitialized value in concatenation" warning).
Alternately, you could split on whitespace and modify
the words that aren't OR/AND/NOT...see my tested code
below...
#!/usr/bin/perl -w
use strict;
my $searchstring = "aaaa OR bbbb OR cccc OR dddd";
my @words = split(/\s+/, $searchstring);
$searchstring =~ s/(\w+) ?(OR|NOT|AND)?/\(fieldname \= '$1'\) $2/g;
foreach (@words)
{
next if /^(OR|AND|NOT)$/;
$_ = "(fieldname = '$_')";
}
my $searchstring2 = join(' ', @words);
print "regex: $searchstring\nsplit/join:$searchstring2\n";
yours,
Michael | [reply] [d/l] |
|
|
This is not the first time where i have seen the split
solution be more robust than the substitute solution. I
just ran your code with the following string:
my $searchstring = "aaaa OR bbbb OR NOT cccc AND dddd";
And was happy to see the following results (formatted for
the Monastery):
regex:
(fieldname = 'aaaa') OR
(fieldname = 'bbbb') OR
(fieldname = 'NOT')
(fieldname = 'cccc') AND
(fieldname = 'dddd')
split/join:
(fieldname = 'aaaa') OR
(fieldname = 'bbbb') OR NOT
(fieldname = 'cccc') AND
(fieldname = 'dddd')
jeffa
L-LL-L--L-LL-L--L-LL-L--
-R--R-RR-R--R-RR-R--R-RR
B--B--B--B--B--B--B--B--
H---H---H---H---H---H---
(the triplet paradiddle with high-hat)
| [reply] |
Re: Parsing a boolean search string for SQL query
by gjb (Vicar) on Sep 01, 2003 at 17:05 UTC
|
It would be much simpler to work the other way round, i.e. by replacing all query terms 'x' by 'fieldname = x'. This would mean something like:
s/(\w+)/$1 ne 'OR' && $1 ne 'AND' && $1 ne 'NOT' ? "fieldname = '$1'
+" : $1/eg;
Hope this helps, -gjb- | [reply] [d/l] |
|
|
Brilliant gjb! The incredible power of reg exp's is again demonstrated. I had no idea you could use the implied if/then/else like that. Oh, I love the Monastery.
| [reply] |
Re: Parsing a boolean search string for SQL query
by jeffa (Bishop) on Sep 01, 2003 at 17:11 UTC
|
And just exactly how are you going to account for NOT?
Remember, the syntax for NOT is something like:
SELECT * FROM movie WHERE title LIKE 'A%' AND title NOT LIKE 'Ab%'
And the fact that i chose to use LIKE instead of = says
something as well ... your interface is going to get
hairy with the approach you have chosen.
Instead of parsing some user-given string, you should
restrict their interface to lists and check boxes/radio
buttons. Then you can create your SQL statement with
something like SQL::Abstract.
jeffa
L-LL-L--L-LL-L--L-LL-L--
-R--R-RR-R--R-RR-R--R-RR
B--B--B--B--B--B--B--B--
H---H---H---H---H---H---
(the triplet paradiddle with high-hat)
| [reply] [d/l] |
|
|
I love people who put LIKE '$user_query%' into their SQL, especially when they don't restrict the query charset and/or result set size properly. It makes data mining really easy. You just query for % and generally get a database dump.....
Not suggesting that jeffa would for a moment, just a relevant note. Here is one example where you can easily extract the database. Here is another where the lack of server side checking lets you ask for the entire database (with a correctly crafted POST) and get it. For example this snippet of HTML will dump the ENTIRE braintrack database into your browser (the thank you for attending our source code is kinda funny in context).....
<FORM action="http://www.braintrack.com/search.htm" METHOD=POST>
<INPUT TYPE="hidden" NAME="term1" value="%">
<INPUT TYPE="hidden" NAME="term2" value="">
<INPUT TYPE="hidden" NAME="term3" value="">
<INPUT TYPE="hidden" NAME= "join" value="and">
<INPUT TYPE="hidden" NAME="top" value="7000">
<INPUT TYPE=SUBMIT VALUE="Search">
</FORM>
I doubt that this is what was intended when they put the DB online. For the record I emailed braintrack.com about this issue some months ago but they have not done anything about it (or actually had the courtesy to reply, for that matter).
cheers
tachyon
s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print
| [reply] [d/l] |
|
|
Nope. You got me:
http://dvd.unlocalhost.com/?list_m=%25&go_movie=By+Movie
That should only show movies that start with some string ...
but i did not check for the fact that the user could provide
their own GET query and specify only '%' ... of course, for
this application ... it is OK, but that still does not
excuse the potential security hole i dug.
For the record, i am rewiting that interface with
Template and Class::DBI. It's slow going,
because i have tables in the 3rd Normal Form, and Class::DBI
currently does not handle selections across tables as well
as i would like. But as soon as i am finished, i will share
it with the Monastery (i have not written a large tutorial in a while ...).
UPDATE: (for anyone interested in such)
My DVD CGI search form uses POST requests. This (in the
past) meant that web bot scripters needed to look at the
source and munge out all the parameters in order to
re-create the proper form query. I used the
Web Developer Plugin for the
Firebird browser
to literally convert POST requests to GET requests on a
loaded HTML page. Then, when you submit, the form parameters
are sent as a GET request, allowing you to cut-n-paste the
URL such as i did above. Of course, this only works if the
CGI script you are querying accepts both GET and POST
requests. Not all do.
(*cough*
*cough*)
jeffa
L-LL-L--L-LL-L--L-LL-L--
-R--R-RR-R--R-RR-R--R-RR
B--B--B--B--B--B--B--B--
H---H---H---H---H---H---
(the triplet paradiddle with high-hat)
| [reply] |
|
|
Re: Parsing a boolean search string for SQL query
by Abigail-II (Bishop) on Sep 01, 2003 at 21:38 UTC
|
s/(\w+) ?(OR|NOT|AND|$)/\(fieldname \= '$1'\) $2/g;
should do it. But that's just for this particular example.
If you want to be able to process something more complicated,
I'd recommend using a parser. For instance, Parse::Recdescent,
or a regular expression using (??{ }) constructs.
Abigail | [reply] [d/l] [select] |
Re: Parsing a boolean search string for SQL query
by zby (Vicar) on Sep 01, 2003 at 16:36 UTC
|
#!/usr/bin/perl -w
use strict;
my $searchstring = "aaaa OR bbbb OR cccc OR dddd";
$searchstring =~ s/(\w+) ?(OR|NOT|AND)|(\w+)$/\(fieldname \= '$1$3'\)
+$2/g;
print "SELECT * FROM tablename WHERE $searchstring\n";
__OUTPUT__
Use of uninitialized value in concatenation (.) or string at a.pl line
+ 7.
Use of uninitialized value in concatenation (.) or string at a.pl line
+ 7.
Use of uninitialized value in concatenation (.) or string at a.pl line
+ 7.
Use of uninitialized value in concatenation (.) or string at a.pl line
+ 7.
Use of uninitialized value in concatenation (.) or string at a.pl line
+ 7.
SELECT * FROM tablename WHERE (fieldname = 'aaaa') OR (fieldname = 'bb
+bb') OR (fieldname = 'cccc') OR (fieldname = 'dddd')
But it is not perfect as you can see. | [reply] [d/l] |
Re: Parsing a boolean search string for SQL query
by dmitri (Priest) on Sep 01, 2003 at 20:12 UTC
|
Implementing my own parser for simple SQL-like query language was easy with Parse::RecDescent. Then you can build a nested structure with callbacks so you can even test the statement with different values.
| [reply] |
Re: Parsing a boolean search string for SQL query
by Anonymous Monk on Sep 01, 2003 at 18:47 UTC
|
| [reply] |