On one hand there are all kinds of project in Perl such as Class::DBI that help you avoid writing any SQL and try to help you eliminate any database specific issues. That it, I get database independence but I pay by performance.

On the other hand I could write SQL queries and stored procedures that would make my whole application much more efficient but will tide me to one specific database.

It seems that the Perl people (and probably this is true for other programming languages too) will try to hide any SQL and make sure you are database independent. At the same time the database people will try to make sure that you are actually writing most of the business logic within their framework using their stored procedures etc.

I have been using CDBI a lot in my recent projects and works but I think I am missing some great opportunities that I don't dive deep in the capabilities of one of the databases.

How much do you use database specific features in your applications ?

Replies are listed 'Best First'.
Re: Should it be database independent?
by chromatic (Archbishop) on Mar 21, 2005 at 21:00 UTC

    Unless you're working on a project that needs database independence -- such as an open source application -- a better goal with this abstraction is encapsulation. It's much easier to work on code that has database access all in one place than strewn about.

      I agree. Unless you can use a mostly database independent layer like Class::DBI (and that depends mostly on the kind of queries you need to do), IMO the best you can do is to at least make it easy to switch databases later. You usually don't need database-independence, you just don't want to make it too hard to switch. Factoring out the SQL from the application code is probably a good idea. For instance, you can probably go a long way just by putting the SQL statements in a seperate file.

Re: Should it be database independent?
by Mugatu (Monk) on Mar 21, 2005 at 20:52 UTC

    Whether you tie your app to one specific database, or use Class::DBI, you should still have your own abstraction. It should be application specific, yet hide the details of how the data is coming into existence. That way, even if you are hard tied to one database, all the hard ties are in one spot. Most of your application will simply talk to your data layer, and your data layer will talk to DBI, or Class::DBI, or whatever.

    Update: to answer your question more directly, I usually tend to go with hard coded SQL directly against DBI. I rarely change databases, and if so, my standard abstraction layer approach saves me from most of the trouble.

Re: Should it be database independent?
by brian_d_foy (Abbot) on Mar 21, 2005 at 23:34 UTC

    I don't think actual database independence is as important as the flexibility to be independent. That is, use as many database specific features as you can to spped up your program, but don't let that lock you into that database.

    On the other hand, I don't think you should go so far into the database-independence religion that you make it impossible to get the most out of your database server (especially if you paid for all its fancy features). I know a lot of people like Class::DBI, but now that they use it, they're stuck with it. I think that can be bad too.

    --
    brian d foy <bdfoy@cpan.org>
Re: Should it be database independent?
by jZed (Prior) on Mar 21, 2005 at 20:44 UTC
    > It seems that the Perl people ... will try to hide any SQL

    Depends on your programming style. If you use straight DBI instead of CDBI, you can do whatever you want with the SQL.

    > and make sure you are database independent

    Again, that depends on your programming style. DBI is designed to provide a maximum of database independence without compromising the ability to use database specific features when needed.

      Just to be clear, CDBI does not prevent you from writing custom SQL, it just provides an abstraction layer. You can write any SQL statement you want (including complex joins, etc.) into a CDBI class and reference it from your code.

      We use CDBI as our abstraction layer, sometimes with complex SQL. The advantage here is the SQL is re-usable and all in one place. We use Oracle, but generally try to avoid 'special' Oracle features.

Re: Should it be database independent?
by samtregar (Abbot) on Mar 21, 2005 at 22:23 UTC
    Should it be database independent?

    That depends on what it is, doesn't it? For most of the things I create I don't find database portability to be worth very much. I'm roughly as likely to abandon MySQL as I am to abandon Perl!

    Maybe if the SQL standard was more usable it would be worth writing to, but frankly, it's not. There's a reason that there isn't a single database product which supports ANSI SQL and nothing else; it's not enough.

    -sam

Re: Should it be database independent?
by kimanaw (Beadle) on Mar 22, 2005 at 04:41 UTC
    This issue has become a bit of a puzzle to me of late, probably because I've been fortunate to work with robust SQL implementations, and *very* large DBMS's, for a couple decades. The notion of hiding the SQL sounds all well and good, but when I witness the sort of encapsulation it leads to, I begin to wonder if I'm the only perl hacker that regularly deals with billion+ row tables.

    While segregating the SQL from the other business logic is probably a good, idea, I'd advise caution before drinking the kool-aid of full blown OO-abstraction. If you're stuck with MySQL, Access, or some other limited SQL vernacular, I spose it makes some sense...but if your app is doing significant things with the data it returns (e.g., data mining, KD, etc.), it might be time to consider upgrading to a more robust DBMS that can support big chunks of that processing with its SQL syntax. FWIW: DBI has been mentioned numerous times here; you may find SQL::Preproc' a simpler/nimbler solution.

      You're not the only one who deals with huge tables. I haven't done the billion+ in a given table, but I have done queries against million+ JOIN million+ JOIN million+ for reports that had to return in under 5 seconds, including rendering to the user.

      If you're stuck with MySQL, Access, or some other limited SQL vernacular, . . . doing significant things with the data it returns (e.g., data mining, KD, etc.)

      That's funny ... I was doing data mining with MySQL. (That's those 1E18-row queries I was talking about before.) Access is limited, but it was never meant to be a RDBMS - that's what SQL*Server is for. MySQL may have been limited in the past, but it is a fully-functional RDBMS that supports ACID transactions and complete failover.

      . . . a more robust DBMS that can support big chunks of that processing with its SQL syntax.

      I'm curious - what's Oracle's version of LIMIT again? What about Oracle's restrictions on full-text searches? Or, how about Oracle's inability to turn ACID off when I, the developer, know I will never want it? Or, was it the overly-complicated clustering technology that Oracle itself agrees has a single failure point? Please elaborate ...

      FWIW: DBI has been mentioned numerous times here; you may find SQL::Preproc' a simpler/nimbler solution.

      It's also a source filter. Source filters have a few drawbacks:

      • Slower than a non-filter - they have their own parsing section which is run in Perl
      • Can make mistakes - Only perl can parse Perl, which includes non-Perl stuff
      • May not work in a persistent environment correctly - this includes mod_perl

      Personally, I drink the "Separation-Of-Concerns" KoolAid™ and keep my Perl and SQL separate. This means using either Class::DBI (if I want an OO abstraction) or DBI (if I want performance).

      Being right, does not endow the right to be rude; politeness costs nothing.
      Being unknowing, is not the same as being stupid.
      Expressing a contrary opinion, whether to the individual or the group, is more often a sign of deeper thought than of cantankerous belligerence.
      Do not mistake your goals as the only goals; your opinion as the only opinion; your confidence as correctness. Saying you know better is not the same as explaining you know better.

        You're not the only one who deals with huge tables. I haven't done the billion+ in a given table, but I have done queries against million+ JOIN million+ JOIN million+ for reports that had to return in under 5 seconds, including rendering to the user.

        Unless those were outer joins, the number is likely much less than 1E18 (I'm assuming MySQL's optimizer is smart enough to minimize join processing). Did they include any range joins on non-index columns ?

        That's funny ... I was doing data mining with MySQL. (That's those 1E18-row queries I was talking about before.) Access is limited, but it was never meant to be a RDBMS - that's what SQL*Server is for. MySQL may have been limited in the past, but it is a fully-functional RDBMS that supports ACID transactions and complete failover.

        I'll take your word for it, tho my review of their SQL syntax leads me to believe they still have some way to go. (I've obviously stepped into the MySQL vs. the World hornet's nest, I should know by now not to bring it up..)

        I'm curious - what's Oracle's version of LIMIT again? What about Oracle's restrictions on full-text searches? Or, how about Oracle's inability to turn ACID off when I, the developer, know I will never want it? Or, was it the overly-complicated clustering technology that Oracle itself agrees has a single failure point? Please elaborate ...

        Hmm, since I never mentioned Oracle in my post, I'm not certain what question I'm being asked to elaborate on?

        It's also a source filter. Source filters have a few drawbacks:

        * Slower than a non-filter - they have their own parsing section which is run in Perl

        A relative factor. If the application is just running "SELECT NOW()", you're quite right. If however, you're doing a large scale complex query, or multiple queries, the net impact of the source filtering is likely much less than the actual database processing time. And, of course, if run under a persistent environment, the source filtering happens during the load, not runtime. Since SQL::Preproc just emits DBI calls, runtime performance should be comparable to a DBI app.

        * Can make mistakes - Only perl can parse Perl, which includes non-Perl stuff

        Hmmm, last time I checked, perl can make mistakes too (thats part of the reason the release numbers have that 3rd digit). For that matter, I've heard that perl programmers occasionally make mistakes. So, based on your logic, I guess we all should stop programming in perl...or perhaps altogether.

        * May not work in a persistent environment correctly - this includes mod_perl

        Can't answer that one, except to say I don't see anything in SQL::Preproc that persists anything the application can't override.

        SQL::Preproc can make some things quicker/simpler to implement. Its intended as a "simple things should be easy, and hard things should be possible" solution for developers that are comfortable with SQL.

        Personally, I drink the "Separation-Of-Concerns" KoolAid™ and keep my Perl and SQL separate. This means using either Class::DBI (if I want an OO abstraction) or DBI (if I want performance).

        A matter of preferences, then. Some folks prefer a little mustard on their hot dogs, and some of us prefer a little hot dog with our musard. I guess the point I was originally trying to make was that I've seen many instances where separation/encapsulation led to throwing an object wrapper around a cursor on a large table, and processing every returned row on the client, in perl (or Java or C++, or whatever), and ignoring the capability of the underlying DBMS and/or its SQL syntax (MySQL included!). Once those little wrappers get written - and esp. if they're written to a least common denominator SQL -, they tend to take on a life of their own, and opportunities for exploiting powerful DBMS features get ignored or shut out.

        I'm curious - what's Oracle's version of LIMIT again?

        Tom knows

        /J

Re: Should it be database independent?
by Anonymous Monk on Mar 22, 2005 at 11:36 UTC
    Hard to say. It very much depends on your goals. If you write an application that you want to sell, or have used, by as many people as possible, and your main use of the database is to store a bit of data, it makes sense to go for database independance. OTOH, if you are writing a specific application to be run by your company, and your company only, and your company has been using Oracle for hundreds of applications for the past 20 years, it's not very likely there will be a need to run your application on DB2 anytime soon.

    It's similar to the question how portable Perl applications should be. It always involves trade-offs. If it's more portable, it can be deployed on more platforms. But it might mean the code is more complex, or less efficient. How much more efficient will depend on your data, and how you have of it. If being more portable means your queries that twice as much (because you can't use a cool database dependent feature), it matters if it means that the queries now takes 0.2 instead of 0.1 seconds, or 10 minutes instead of 5.

Re: Should it be database independent?
by johndageek (Hermit) on Mar 22, 2005 at 13:33 UTC
    Our corporation has selected Oracle, so we go with hard coded SQL to gain database efficiency. The odds of changing databases is very slim for us. However, if you are writing code for an external market, database independence would be more important.

    Enjoy!
    Dageek

      Famous last words ;-)

      I'm sure Peoplesoft thought that, once they had selected (partnered with) IBM DB2, the odds of changing to Oracle was very slim. ;-)

Re: Should it be database independent?
by skyknight (Hermit) on Mar 24, 2005 at 01:51 UTC

    There's no one answer here, except perhaps for "it depends". Use whatever tool is best suited to getting the job done that you need to do while simultaneously trying to be sufficiently forward looking that you do not paint yourself into a corner. Sounds easy, doesn't it?

    For most small projects, you probably don't need the high octane solution of hand crafted SQL. As such, pick a good framework and exploit it's abstractions to your advantage. If your requirements start bumping up against the limitations of its feature set, then try looking for another framework that provides the features you need, or perhaps consider contributing to that framework by augmenting its feature set. Of course, the latter might not be an option, as one of the features of a framework may be that it is compact. Simplicity itself can be a feature, and adding too much to a framework can be just as harmful as leaving certain things out.

    Of course, even if you decide to implement a roll-your-own solution, you'll probably end up creating a framework of sorts for yourself. You'll see all sorts commonality between various tasks and want to create an abstraction. Beware feature creep.