deprecated has asked for the wisdom of the Perl Monks concerning the following question:
<!- i use postgres. you should too. -->
hi, monks.
i've been using dbi a whole lot lately. im writing usually upwards of 700 lines
of code per week. anyone who's coded with dbi knows that its kind of hard to write
code that isnt fraught with unmaintainable peril.
i guess the problems come in because of all the references and abstractions used.
here are a few examples of stuff I find difficult to code in dbi in a manner that
is clearly readable by novice programmers (many of the people I work with) and also
not so cluttered that intermediate programmers can read it without too much pain.
inserts with $sth
my $sth = $dbh -> prepare("insert into table1 (col1, col2, col2) value
+s (?, ?, ?)");
my beef here is that the line winds up being so long. risacher and i have discussed
style and spacing. i used to think that horizontal space wasnt such a premieum, but
vertical space was. this was when i was using a 21" monitor. i'm now using a 15" laptop
screen. the above is a very simple query, and still comes out at 84 characters. you
can see how cramped lines this long can be in this screen sh
ot.
This problem is even further complicated by using descriptive variable names. Let's
look a little further at this:
# do we need an insert or an update?
my $row_checker = $dbh -> prepare("select count(col1) from table1 wher
+e col1 = ? and col2 = ? and
col3 = ?");
my $inserter = $dbh -> prepare("insert into table1 (col1, col2, col2)
+values (?, ?, ?)");my $upda
ter = $dbh -> parepare("update table1 set col1 = ?, col2 = ?, col3 = ?
+ where col1 = ? and col2 =
? and col3 = ?");
as you can see, this gets ugly in a hurry. so i've been doing two things when this
starts to bother me and set off my ugly-sensors.
my $row_checker = "select count(col1) from table1 where col1 = ? and c
+ol2 = ? and col3 = ?";
$row_checker = $dbh -> prepare($row_checker);
# or ...
my $updater = $dbh -> prepare(qq{
update table1 set col1 = ?, col2 = ?, col3 = ?
where col1 = ? and col2 = ? and col3 = ?
});
i still don't really like either of these. the first one is more compact for smaller
queries, but still doesnt help for larger queries. the second one is okay, but i dislike
having the }); sitting all by itself on another line. but it is much
better than the original, imho.
pulling out data from the database
so lets say you want to pull a single tuple out of the database. this is something
most of us do quite frequently. even on a huge database like the one at work (terabyte
scale), i still do want one tuple pretty frequently. here's what the dbi pod would
have you do:
my $rowRef = $dbh -> selectrow_arrayref("select col1 from table1 where
+ col1 = col2 and col2 != '$
someval'");
my @row = @{ $rowRef };
#or ...
my $tuple = shift @{ $rowRef };
#or ...
my $tuple = @{ $rowRef }[0];
there are some problems with this for people who use warnings and/or strict. if your
reference comes back empty (because your query failed), your whole script dies with the
ever popular "cant use an undefined value as an arrayref at oops.pl line 100", which
sucks. i dont want my program to crash because my query failed, i'd rather trap the
error and report it to the user (and/or the developer)! instead, i've been using this:
my ($tuple) = map { @{ $_ } } $dbh -> selectall_arrayref("select col1
+from table1 where col1 = co
l2 and col2 != '$someval'");
of course you can use trickery like $query or selectall with a multiline qq{}, but i
like the above method
because then I can do:
if ($tuple) { # succeeded...
and i dont have to play around with if ref $tuple ... which I find rather irritating
.
now, that having been said, i generally prefer to not use $dbh when I can use a statement handle
instead.
my $next_col1 = "select distinct(col1) from table1";
$next_col1 = $dbh -> prepare($next_col1);
$next_col1 -> execute();
my @col1s = map { @{ $_ } } @{ $next_col1 -> fetchall_arrayref() };
foreach my $col1 (@col1s) { ...
in general, if youre going to be executing a query a lot of times, a $sth is a better
bet than just using $dbh. its faster, and you let the DBD module take care of quoting
and whatnot. it also gives you more meaningful errors if you call it with too many
bind values or not enough.
i'm sure I have more syntactic tricks up my sleeve that i'm not remembering right now.
the purpose of this post, however, was to see if anyone was doing anything with dbi's
refs of refs of refs of (gack!)... and making it easier to read and/or write.
happy new year and stuff.
brother dep.
--
Laziness, Impatience, Hubris, and Generosity.
Re: dbi style questions (code, discussion)
by talexb (Chancellor) on Dec 29, 2001 at 08:02 UTC
|
my $sth = $dbh -> prepare("insert into table1 (col1, col2, col2) value
+s (?, ?, ?)");
This wraps a little ugly; I'd change it to
my $sth = $dbh->prepare(
"insert into table1 (col1, col2, col2) values (?, ?, ?)");
That way, the entire SQL statement is on a line by itself. A little lower you have
my $updater = $dbh -> prepare("update table1 set col1 = ?, col2 = ?, c
+ol3 = ? where col1 = ? and col2 = ? and col3 = ?");
I'd change that to
my $updater = $dbh -> prepare(
"update table1 set col1 = ?, col2 = ?, col3 = ? \
where col1 = ? and col2 = ? and col3 = ?");
That makes the variables all line up (I know, it's not to everyone's taste.) Although I ponder at the logic of that statement .. if the row values are already set to those values, what good is setting them to those same values? Usually when I do an update, I get an id value for the row that I'm going to change, and do something like
my $updater = $dbh->prepare(
"update table1 set col1 = ?, col2 = ?, col3 = ? where id = $ID");
Generally I believe you want to use placeholders like ? when you are going to be inserting or updating many rows. For one-offs it isn't really necessary.
In general style terms, I try to stick with an 80 character limit on line lengths. I may have a big monitor now, but maybe later I'll have to look at my code on my client's 14" screen in character mode. Then I be lookin' stoopid.
Make your code as readable as possible -- if you can't read your own code in three months time, what chance does anyone else have?
--t. alex
"Excellent. Release the hounds." -- Monty Burns.
| [reply] [d/l] [select] |
(dkubb) Re: (1) dbi style questions (code, discussion)
by dkubb (Deacon) on Dec 29, 2001 at 11:34 UTC
|
I find the messy-ness that you speak of to be a
common occurence when other languages are intermixed inside
perl. Compared to normal perl, SQL or even HTML breaks up
the flow and can cause confusion with novice and experienced programmers alike.
However, I do think it's possible to write clean DBI code, and simplify on some of the methods you use to
fetch data from the database handle.
I agree with you splitting up the SQL query onto
multiple lines is a better idea. I also capitalize the
SQL keywords/commands, and lower case the columns and
table names:
my $sth = $dbh->prepare(q{
INSERT
INTO table
(col1, col2, col3)
VALUES (?, ?, ?)
});
Style is very subjective, I'd say work what's best
for you, and most importantly, be consistent. The novice programmers you work with will learn any style you choose, as long as you/they aren't changing your mind every few weeks =)
Along with SQL formatting conventions, I follow a few
simple naming conventions when working with DBI:
- All statement handles are prefixed with $sth_, as in $sth_get_customers if I am using more than one, otherwise I just use $sth.
- If I am using more than one database handle, then
I prefix all of them with $dbh_, as in $dbh_accounting. Otherwise I just use $dbh.
All of the above conventions won't really do much good
if you have to jump through hoops to get the data out
of the "fetch" commands. =) Luckily, there are simpler
ways to pull data from the database.
IMHO, the best way to pull a tuple from a data source is with DBI::selectrow_array:
my $tuple = $dbh->selectrow_array(q{
SELECT col1
FROM table1
WHERE col1 = col2
AND col2 = ?
LIMIT 1
},
{},
$someval,
);
I agree with your position that using statement handles is faster (especialy with some databases that can prepare a query plan, like Oracle) when you are querying over and over, when compared to using DBI's utility methods. But, there is one exception that alot of people may not know: you can prepare a statement and pass that off to a utility method, not losing any speed, and cutting a few lines of code out. Consider the following:
my $sth_get_id = $dbh->prepare(q{
SELECT id
FROM contact
WHERE email = ?
LIMIT 1
});
foreach my $email (@emails) {
my $id = $dbh->selectrow_array($sth_get_id, {}, $email);
print "$email -> $id\n";
}
All of DBI's utility methods can accept statement handle
rather than a straight SQL query as the first parameter.
Also, if you want to fetch multiple rows, but
want only the value from the first column of each row, DBI has a built in method called selectcol_arrayref
that will do exactly that:
my $emails = $dbh->selectcol_arrayref(q{
SELECT email
FROM contact
WHERE id BETWEEN ? AND ?
},
{},
1,
100,
);
foreach my $email (@$emails) { ...
| [reply] [d/l] [select] |
Re: dbi style questions (code, discussion)
by markjugg (Curate) on Dec 29, 2001 at 21:23 UTC
|
A couple tips:
First: Regarding ugly insert and update statements, I solve this problem by using DBIx::Abstract with it's insert() and update() functions:
use DBIx::Abstract;
# get started using existing $DBH handle
my $db = DBIx::Abstract->connect($DBH);
# insert from a hash, auto-quoting along the way
$db->insert('table_name',\%hash_ref);
# update from a hash, auto-quoting for you
$db->update'table_name',\%hash_ref,'item_id = 3');
Very handy. I don't really use the rest of the DBIx::Abstract interface. I prefer straight DBI most of the time. One plus to this system: DBIx::Abstract can handle passing in constants that don't get quoting, like CURRENT_DATE, for example. See it's docs for that exact syntax. One minus to this system: It doesn't use DBI's prepare_cached() method, one of my new favorite DBI tricks for an easy performance boost.
Second: In many cases you can combine the the flexibility of $sth methods, like using prepare_cached() and placeholders, with the simplicity of the $DBH calling style, like this:
$sth->prepare_cached("SELECT * from t where id = ?");
$DBH->selectall_arrayref($sth, {}, @bind_values);
-mark | [reply] [d/l] [select] |
Re: dbi style questions (code, discussion)
by BazB (Priest) on Dec 29, 2001 at 15:13 UTC
|
At this point, I've only used a combination of Tcl and SQL to query a database.
IMHO, I think it's irrelevant which languages you're using, things are really going to get confusing if you've got a mix of scripting, markup and SQL unless you format everything to the point of excess.
Have a look at Philip Greenspun's SQL for Web Nerds, particularly the style section.
It's all Oracle8i stuff, but it should apply to anything with SQL.
When I've coded scripts with large SQL queries, I often put the query into a string:
my $query = "SELECT post, post_time, user_id, post_id
FROM my_table
WHERE post_time > sysdate + 1
ORDER BY post_id;";
Then query the database with the command required (can't do it in Perl yet :-) with $query in that command instead of a lump of SQL and scripting/Perl mashed together.
Even if you don't like my idea of having the query seperate, or it's impossible, then just get used formatting the queries so that they stand out from the Perl and are readable as SQL.
Hope that helps,
BazB.
Update: Seems like a slight more Perl-y way of doing this might be to use HERE documents, as described in this node by steves.
Further update: fixed the link to SQL for Web Nerds
| [reply] [d/l] [select] |
Re: dbi style questions (code, discussion)
by edebill (Scribe) on Dec 29, 2001 at 22:39 UTC
|
You're right that it gets ugly fast. Like you, I've been doing a lot of DBI work, and I've found that the bigger the SQL statements, the uglier it gets. Here's what I came up with
Starting with the "everything in a line" technique, but showing more columns/values
my $bsql = "insert into messages(messageid, quoteid, subject, body, viewed, toid, fromid, timestamp_c, timestamp_m) values($next_messageid, $quoteid, $q_subject, $q_body, 0, $toid, $fromid, CURRENT TIMESTAMP, CURRENT TIMESTAMP)";
How long does it take to make sure all your columns names and values line up? What happens the next time you need to add a column? I get lost pretty quick looking at that style.
Now, let's try it in a vertical layout:
my $sql = "insert into messages (
messageid,
quoteid,
toid,
fromid,
subject,
body,
viewed,
timestamp_c,
timestamp_m
) values (
$next_messageid,
$quoteid,
$toid,
$fromid,
$q_subject,
$q_body,
0,
CURRENT TIMESTAMP,
CURRENT TIMESTAMP)";
OK, so you have a little better hope of adding/removing columns without breakage. And people are generally better at counting lines than counting words (for figuring out exactly which column a value goes with). C-k C-k C-y in emacs (or the vi equivalent) can move columns,values around very efficiently. I definitely like this one better (even though it grows off screen pretty quick).
Now, for a third way. How about something like:
my $msgdata = { messageid => $next_messageid,
quoteid => $quoteid,
toid => $toid,
fromid => $fromid,
subject => $q_subject,
body => $q_body,
viewed => 0,
timestamp_c => "CURRENT TIMESTAMP",
timestamp_m => "CURRENT TIMESTAMP" };
my $sql = "insert into msg (";
my $values = ") values (";
my $started = 0;
foreach my $datum (keys(%$msgdata)) {
if (length($msgdata->{$datum}) > 0) {
if ( $started ){
$sql .= ",\n $datum";
$values .= ",\n $msgdata->{$datum}";
} else {
$started = 1;
$sql .= " $datum";
$values .= " $msgdata->{$datum}";
}
}
}
$sql .= $values;
Now, before you reach for that barf bag, take another look. This third way does some things the first 2 don't. - It checks to make sure there's something there. If one of the values is undefined, it will still generate valid SQL.
- You always know exactly which column a given value goes with.
- In fact, everything after creating the hash can be put into a nice little sub so you don't have to look at the foreach.
It takes some getting used to, but I find the third option a lot more maintainable.
Also notice I'm always creating a scalar to hold the SQL, never putting it right into the prepare(). That's to make it easier to print(). I find that I have to do that so often I might as well just plan on it, no matter which style I'm making the SQL with.
SQL is an ugly language (why does update have nice name=value syntax, while insert doesn't?) and it just gets uglier as you add code to $dbh->quote() all your values and wrap things in the eval{} blocks that make your code robust enough to handle a failed transaction. Automatically generating the SQL at least cuts out some typo problems. | [reply] [d/l] [select] |
|
Nice idea. I've some simple improvements to make it more
human friendly.
# some sample data.
my %msgdata = ( messageid => $next_messageid,
quoteid => $quoteid,
body => $q_body,
viewed => 0,
timestamp_m => "CURRENT TIMESTAMP" );
# you can do a sort here if you care about the order
# of the keys.
my @keys = keys %msgdata;
# uncomment this if you want to exclude all pairs
# that have no value.
#@keys = map {defined($msgdata{$_})? $_ : ()} @keys;
# get all my values in the same order as my keys.
my @values = @msgdata{@keys};
# replace any undefined values with "NULL".
# Comment this out if you're excluding these above.
# Note: if you're using $dbh to quote your values
# then use $dbh->quote($_) for the true option
@values = map {defined($_)? $_ : "NULL"} @values;
# my sql statement.
my $sql = "INSERT INTO msg ( " . join (",\n", @keys) . ") " .
"VALUES " . join (",\n", @values);
# or with DBI: (but not so good for printing)
# you don't need to use the $dbh->quote above if
# you use this.
my $sql2 = $dbh->do(
"INSERT INTO msg ( " . join (",\n", @keys) . ") " .
"VALUES " . substr(("?,\n" x @values), 0, -2),
undef,
@values);
I like DBI. But a friend who doesn't like his SQL to mess
up his code puts all of the queries into a module and then
calls them functionally. eg:
insert($table, @keys, @values);
# or possibly
insert($table, $data_ref);
# and definately
$data_ref = select($table, $conditions_ref, $desired);
That works as well of course.
Jacinta
| [reply] [d/l] [select] |
|
# my sql statement.
my $sql = "INSERT INTO msg ( " . join (",\n", @keys) . ") " .
"VALUES (" . join (",\n", @values) . ")";
I like it. Using join() this way simplifies the SQL generation a lot.
Unfortunately, there's no real help for manually $dbh->quote()ing values that need it - we use DB2 at work, and quoting a number gives a SQL error (grrrrrrrr). That's my #1 complaint about the thing.
We'll probably move to something between what your friend does, and the inline code. Generate the actual SQL in a module, but maintain control of it's execution inline - otherwise you're borked with transactions, and I hate passing $dbh's around all over the place. | [reply] [d/l] |
|
|
|
Re: dbi style questions (code, discussion)
by guha (Priest) on Dec 29, 2001 at 15:39 UTC
|
I mostly use something like my $sqlstmt = <<SQL_A;
select count(col1) from table1
where col1 = ? and col2 = ? and col3 = ?
SQL_A
If the SQL statement gets really big I would consider wrapping it into a sub, aka my $sqlstmt = def_sql_a;
sub def_sql_a {
return <<SQL_A;
select count(col1) from table1
where col1 = ? and col2 = ? and col3 = ?
SQL_A
}
This is also consistent with hiding the gory details in a sub when creating "dynamic" SQL statements. | [reply] [d/l] [select] |
Re: dbi style questions (code, discussion)
by uwevoelker (Pilgrim) on Dec 29, 2001 at 23:30 UTC
|
At first I would like to thank markjugg for the tip with DBIx::Abstract.
I usually use the other INSERT syntax (similar to UPDATE, with SET).
Here comes an example:
sub db_store_event {
my $id = shift || 0;
my $ret = $id;
if ($id > 0) {
# update
my $hash = db_get_event($id);
my $event = impand_event(); # konvertiere Uhrzeit und Datum
my @update = ();
foreach my $key (keys %$hash) {
if (defined $event->{$key} and ($hash->{$key} ne $event->{
+$key})) {
push @update, $key.'='.$dbh->quote($event->{$key});
}
}
if (scalar @update > 0) {
db_do("UPDATE ".$config->{event_table}." SET ".join(', ',
+@update)." WHERE id=$id");
}
} else {
# insert
my $event = impand_event(); # konvertiere Uhrzeit und Datum
my @dat = ();
foreach my $key (keys %$event) {
push @dat, $key.'='.$dbh->quote($event->{$key}) if $event-
+>{$key};
}
my ($err, $sth) = db_sth("INSERT INTO ".$config->{event_table}
+." SET ".join(', ', @dat));
$ret = $sth->{'mysql_insertid'} or die "no db_id";
}
return $ret;
}
The routine impand_event() konverts the cgi parameter to an hashref with keys like the fields in the database and does some conversion (for date and time).
sub impand_event {
my ($datum, $zeit);
my %event = ();
foreach my $key ($cgi->param) {
if ($key =~ m/event\.(\w+)/) { $event{$1} = $cgi->param($key);
+ }
}
$datum = $event{datum_beginn};
if ($datum) { $event{datum_beginn} = date_to_sql($datum); }
$datum = $event{datum_ende};
if ($datum) { $event{datum_ende} = date_to_sql($datum); }
$zeit = $event{zeit_beginn};
if ($zeit) { $event{zeit_beginn} = time_to_sql($zeit); }
$zeit = $event{zeit_ende};
if ($zeit) { $event{zeit_ende} = time_to_sql($zeit); }
return wantarray ? %event : \%event;
}
Sorry for the german comments, but these are real lines of code.
uwevoelker | [reply] [d/l] [select] |
Re: dbi style questions (code, discussion)
by Anonymous Monk on Dec 29, 2001 at 20:37 UTC
|
I do a lot of DBI programming on a daily basis and I still haven't come up with something I am 100% happy with. Most of the SQL statements I use are long multiline ones, which I like the form:
my $sth = $dbh->prepare(q{
SELECT x, y, z FROM table1, table2
WHERE x = ?, y = ?
ORDER BY z
});
I think making the SQL commands uppercase helps them stand out from the code a bit better. If it's a shorter statment I try to squeeze it on to one line. I also use prepared statements for most everything, apart from 1 shot updates, deletes etc.
$dbh->do('delete from temp where x = ?', undef, $x);
According to the DBI docs, using bind_columns is the way to go, it means you can write code like:
$sth->execute;
$sth->bind_column(\$res);
$sth->fetch;
$res would be undef if there was an error.
Another alternative is:
$sth->execute;
unless (@cols = $sth->fetchrow_array) {
# error
}
Hope this gives you a few ideas.
| [reply] [d/l] [select] |
|
Dammit, forgot to log on.
Doh!
| [reply] |
Re: dbi style questions (code, discussion)
by IlyaM (Parson) on Dec 30, 2001 at 01:43 UTC
|
| [reply] |
Re: dbi style questions (code, discussion)
by Ryszard (Priest) on Dec 30, 2001 at 11:35 UTC
|
If performance isnt an issue, one trick i use is to read the sql in from a file.
another trick i use to line up my sql and make it look nice in the code (when i embed it) is concatenation
my $sql .= "rest of statement"I also package up my dbi stuff into a couple of subbies, one for selecting, and one for insert, update, delete, and do all the appropriate error checking (ie DB alive, uname/pwd ok et al). I also pass all the params as anon hashes. Its not amazing code, buts its functional, and works for me. It also means, once i've written my dbi subbies, i dont need to go about writing dbi stuff again, just remember the parameters that must be parsed.. :-)
| [reply] [d/l] |
|
my $sql = <<"END_SQL";
SELECT name, address, phone_number, order_id
FROM customer, customer_order, order_item
WHERE blah blah blah
END_SQL
That makes it readable and it looks pretty nice when you
log it, etc. Here documents: another reason Perl rocks.
| [reply] [d/l] |
|
|