ctaustin has asked for the wisdom of the Perl Monks concerning the following question:

I need to pass a sql query string to obtain the column aliases. For instance
select the_date as "date", round(months_between(first_date,second_date),0) months_old ,product,extract(year from the_date) year ,case when a=b then 'c' else 'd' end tough_one from ... where...
The result I am looking for is a string like:
date,months_old,product,year,tough_one

I have tried a couple of different techniques like loading the sql statement into a string and spliting the data by ",", but this gets honked up with the months between and round functions. The other issue is, that the sql statement is generated by different people, all of whom have their own unique way of aliasing the columns. Some use "as xyz", others just use a space ".. xyz" and some actually quote it, '... as "xyz"'. Further compounding the issue is that sometimes each "column" is on its own line, other times it is not. Other times the whole column takes up more than one line by itself. Thus the attempted ugliness of the example above.

My other thought was to use a regular expression to get rid of the commas within each "column". For example, get rid of the commas in the months_between and round function, leaving me with a string that I could split by comma and take the last piece of. However, I stink at regular expressions and I continously got either a syntax error or a bad result. I thought I could do something as simple as
$sql=/(*,*)/(*)/;
Surely you must be chuckling out loud at such foley.

Any thoughts of an efficient method of obtaining these results? Regards

Replies are listed 'Best First'.
Re: parse a query string
by castaway (Parson) on Jul 06, 2005 at 05:15 UTC
    Is there any particular reason why you're not using existing solutions such as SQL::Parser ?

    If there are (?), then what you need to do is parse the thing from left to right, collecting parts as you go, instead of trying to parse it all at once. A recursive algorithm would be good, or you could just go use existing parser-creators, like Parse::RecDescent or Parse::Yapp.

    C.

      I am going to executing this through DBI. I needed the column aliases to write out to a flat file for use by a third party application. This application is very picky about column headings forcing us to use aliases. I didn't know about  $sth->{NAME_lc}, I will look into that now.
Re: parse a query string
by runrig (Abbot) on Jul 06, 2005 at 05:18 UTC
    Can I ask, "Why do you want this?" If you execute the query in DBI, you can get the column aliases with $sth->{NAME_lc}. Will you not be executing the query? Maybe SQL-Statement will parse it, but I'm not sure if it does aliases, and I'm pretty sure it doesn't do database specific things like "case" statements (though you could maybe subclass SQL::Parser to handle that).
Re: parse a query string
by Ovid (Cardinal) on Jul 06, 2005 at 05:59 UTC

    Any thoughts of an efficient method of obtaining these results?

    For the first pass, I would not recommend that you worry about an efficient method of parsing these results. A slow method that works is surely better than a fast method that doesn't, yes?

    Right off the bat, it looks like your SQL is ugly enough that you need a proper parser. I would recommend you look at Parse::RecDescent. If that turns out to be too slow, turning to Parse::Yapp or something similar might help.

    However, then I got curious and decided to write a "rough draft" of the problem for you. You'll need to clean up the output, but this should pretty much print what you want.

    #!/usr/bin/perl use strict; use warnings; my $sql = <<END_SQL; select the_date as "date", round(months_between(first_date,second_date),0) months_old ,product,extract(year from the_date) year ,case when a=b then 'c' else 'd' end tough_one from XXX END_SQL my $lparen = qr/\(/; my $rparen = qr/\)/; my $keyword = qr/(?i:select|from|as)/; # this is all this problem need +s my $comma = qr/,/; my $text = qr/(?:\w+|'\w+'|"\w+")/; my $op = qr/(?:=>|<=|[=+\-*\/])/; sub lexer { my $sql = shift; return sub { LEXER: { return ['KEYWORD', $1] if $sql =~ /\G ($keyword) /gcx; return ['COMMA', ''] if $sql =~ /\G ($comma) /gcx; return ['OP', $1] if $sql =~ /\G ($op) /gcx; return ['PAREN', 1] if $sql =~ /\G $lparen /gcx; return ['PAREN', -1] if $sql =~ /\G $rparen /gcx; return ['TEXT', $1] if $sql =~ /\G ($text) /gcx; redo LEXER if $sql =~ /\G \s+ /gcx; } }; } my $lexer = lexer($sql); my @tokens; my $nested = 0; while (my $token = $lexer->()) { $nested += $token->[1] if 'PAREN' eq $token->[0]; next if $nested or $token->[1] eq 'PAREN'; last if 'KEYWORD' eq $token->[0] && 'FROM' eq uc $token->[1]; push @tokens => $token; } foreach my $i (0 .. $#tokens) { my $token = $tokens[$i]; next unless 'TEXT' eq $token->[0]; print $token->[1], $/ if $i == $#tokens; print $token->[1], $/ if 'COMMA' eq $tokens[$i + 1][0]; } __END__ # prints out: "date" months_old product year tough_one

    I'm tired and I need to hit the sack, so I can't explain this too much. Let's just say that I turned the SQL into a series of easy to parse tokens. Then I discarded all tokens that were inside of parentheses and after the FROM clause. Then I assumed that any TEXT token at the end or before a comma is what you were looking for. This works for the sample SQL you provided, but it's very fragile. I would write lots of tests to explore edge cases.

    This code is not perfect, but it's a start.

    Note that SQL::Statement cannot be used here because SQL::Parser does not recognize CASE statements.

    Cheers,
    Ovid

    New address of my CGI Course.

Re: parse a query string
by virtualsue (Vicar) on Jul 06, 2005 at 06:47 UTC
    Do you mind saying what your overall goal is? Are you trying to analyze a pile of code? Write some sort of interface to a database?