in reply to Re^2: No stored procedure bashing on my watch!
in thread Recoding a multi-sql-statement storedProc transaction in a script

I don't buy it. Change the data structures around in a significant way and hide it all with stored procs? You'd end up with something messy and slow, just like all of those awful "we'll fix it with a view" solutions.

Many DBAs are reluctant to give out permission for developers to modify stored procs. They see it as a way to make sure that all SQL is filtered through them. The problem is, they tend not to have any time to look at your SQL either, so the vetting process becomes a bottleneck.

That sort of thing can be fixed, albeit slowly and painfully, but it doesn't change the fact that stored procs normally use crippled languages. I wouldn't want to use them for anything with actual programming logic in it.


Re^4: No stored procedure bashing on my watch!
by BrowserUk (Patriarch) on Dec 16, 2004 at 22:04 UTC

    Anything you can do from embedded SQL, you can do from a stored procedure. Instead of preparing and executing the SQL, you invoke the SP that does it for you and returns the same results.
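    A minimal sketch of the idea using DBI (the driver, connection details, and the get_address proc here are assumptions for illustration; the exact placeholder syntax for invoking an SP varies by driver):

    use strict;
    use warnings;
    use DBI;

    # Connection details are placeholders.
    my $dbh = DBI->connect( 'dbi:Sybase:server=MYSERVER', 'user', 'pass',
                            { RaiseError => 1 } );
    my $addressee_id = 42;

    # Embedded SQL: the application knows the table and column names.
    my $sth = $dbh->prepare( 'select street_address, city, state, zip'
                           . ' from addresses where addressee_id = ?' );
    $sth->execute( $addressee_id );

    # Stored procedure: the application knows only the SP's name and signature.
    my $sp = $dbh->prepare( 'exec get_address ?' );
    $sp->execute( $addressee_id );

    # Either way, the rows come back through the same statement-handle interface.
    while ( my @row = $sp->fetchrow_array ) {
        print join( ', ', @row ), "\n";
    }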

    No extra programming logic needs to be written in the SP language. It can be, often saving the need to ship large volumes of data between the DBM and the application program, and then back again, but it doesn't need to be.

    The difference is that when the schema changes, the application program(s) don't. When a new schema is being developed, the SPs are modified in step and tested in isolation from any applications. When the switchover comes to pass, the existing DB is renamed and the new DB is renamed to the old name. On the next connection, all the applications using SPs are automatically using the new schema without a single line of their code needing to change.

    It is exactly the same encapsulation that using objects gives you: OO-101, Tenet 1: isolate the data format from the caller.

    It also provides an isolation layer between the application code and the DBM vendor. Provided the new DBM vendor also supports SPs and the DBI interface allows access to them, you simply write an equivalent SP with the same name in the new vendor's SP language and, again, the applications don't need any modification.

    It also allows the SPs to make use of vendor specific optimisations and extensions without fear of lock-in.


    Examine what is said, not who speaks.        The end of an era!
    "But you should never overestimate the ingenuity of the sceptics to come up with a counter-argument." -Myles Allen
    "Think for yourself!" - Abigail        "Time is a poor substitute for thought"--theorbtwo         "Efficiency is intelligent laziness." -David Dunham
    "Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon
      I'm aware that stored procs don't have to include programming logic, but they often do.

      Your point about isolation from vendor-specific SQL makes a lot of sense, although I wonder how standardized the details of calling procedures are across databases in DBI. Typically, database portability only really matters to people who are shipping a product that must support multiple DBs. Other people are not willing to spend the time and money that portability requires.

      I still don't buy the idea that stored procs can provide protection from schema changes of any significance. If there was a good reason to make the change, it must mean the available data is now different, which means the programs using it need to change. Little things like adding columns don't break SQL -- big things like merging or splitting tables and changing relationships do. Trying to hide changes like that behind stored procs and let programs continue as if nothing happened will surely result in slow and complicated joins and the like.

        If there was a good reason to make the change, it must mean the available data is now different, which means the programs using it need to change.
        No, that's the point. Likely the programs that triggered the change will change (or new programs will be written), but many programs won't. Let me give an example. A bit contrived perhaps, but it's just an example. Suppose you have a table of addresses, with the usual columns: an id, a key to another table telling you whose address it is, a street address, a city, a state (let's assume US addresses), a zip code, etc. It's an important table because many applications use it. There will be several stored procedures accessing this table, one of them being "get_address":
        -- Sybase syntax
        create proc get_address
            @addressee int
        as
            select street_address, city, state, zip
            from   addresses
            where  @addressee = addressee_id  -- addressee_id is the foreign key
        Now a desire arises to separate the street address into two columns: the name of the street, and the house number. Maybe some application wants to do selects on a street - it doesn't really matter why. The schema can easily be changed, and a conversion program (that reads the table, and splits the street address into a street name and house number) isn't too hard either. Now, if all the other programs used the stored procedure to get data out of the table, they don't need to be modified - we just change the stored procedure:
        -- Sybase syntax
        create proc get_address
            @addressee int
        as
            select house_number + " " + street_name, city, state, zip
            from   addresses
            where  @addressee = addressee_id  -- addressee_id is the foreign key
        Note that the signature (the input and output formats) of the stored procedure didn't change. Obviously, applications that want to get the street name separately use different stored procedures.
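        To see why, here is what a hypothetical DBI caller might look like (driver and connection handling assumed); it is identical before and after the split:

        sub address_for {
            my ( $dbh, $addressee_id ) = @_;

            # The application knows only the proc's name and signature,
            # not which columns or tables lie behind it.
            my $sth = $dbh->prepare( 'exec get_address ?' );
            $sth->execute( $addressee_id );
            return $sth->fetchrow_arrayref;  # [ street address, city, state, zip ]
        }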
        Trying to hide changes like that behind stored procs and let programs continue as if nothing happened will surely result in slow and complicated joins and the like.
        That all depends on how often queries are run, and when. A bank really doesn't care if its end-of-week report, run on the night from Saturday to Sunday, now takes 7 hours and 10 minutes instead of the original 7 hours and 3 minutes. But they'll get mighty upset if a batch job that every techie forgot about, because it was written long before any of them started working there, suddenly fails to work. Don't forget that changing the stored procedures doesn't prevent you from writing new (and faster) stored procedures and calling them from your programs. Stored procedures give you the option to leave programs as-is when you modify the schema.

        I'm in total agreement about your distaste for what passes as "a programming language" inside most DBMs; they are usually very restrictive and clumsy to use. Now, if MySQL wanted to really steal a march on the rest of the world, they would embed a Perl interpreter--or better still, a Perlish, domain-specific language using Parrot--inside their engines for use in stored procedures. That could really be something.

        Notionally, an SP is just a named method that takes some arguments and (optionally) returns some results. One of the best things about DBI is its effect of isolating application code from the variations in the specific DBM.

        If DBI defined a standardised routine for calling an SP, binding a set of arguments, and returning a statement handle, that handle would be a cursor providing all the existing ways of retrieving the results. In this way, it would be just as possible to provide a standardised method for calling SPs as it is for other operations.
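        DBI doesn't define such a routine today, but as a sketch of the idea, a thin wrapper could localise the vendor-specific call syntax in one place (the call_sp name and the syntax strings are illustrative assumptions, not a real DBI API):

        # Hypothetical helper: build the right invocation string per driver.
        my %CALL_SYNTAX = (
            Sybase => sub { "exec $_[0] " . join( ', ', ('?') x $_[1] ) },
            ODBC   => sub { "{call $_[0](" . join( ', ', ('?') x $_[1] ) . ')}' },
        );

        sub call_sp {
            my ( $dbh, $proc, @args ) = @_;
            my $builder = $CALL_SYNTAX{ $dbh->{Driver}{Name} }
                or die "No SP call syntax known for driver $dbh->{Driver}{Name}";

            my $sth = $dbh->prepare( $builder->( $proc, scalar @args ) );
            $sth->execute( @args );
            return $sth;    # a cursor, like any other statement handle
        }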

        I still don't buy the idea that stored procs can provide protection from schema changes of any significance.

        This is really the same argument as the one for not providing (externally callable) accessors/mutators for attributes in objects. Just as writing application code that manipulates attributes directly undoes a good part of the benefit of OO, so accessing tables and columns by name is flawed in the same way.

        Just as we shouldn't write code that queries the state of an object and then acts upon that state--instead we should ask the object to do something for us--so we shouldn't be querying values from the DB, modifying them, and then updating the DB to reflect the change. We should be asking the DB to 'add this item to this invoice', not querying the price of the item and then adding that much to the invoice. With the latter method, if the price of the item changes between our querying it and our updating the invoice, we've made an error.
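        As a hypothetical sketch of the two styles (the table, column, and add_item_to_invoice proc names are invented for illustration):

        sub add_item_racy {
            my ( $dbh, $invoice_id, $item_id ) = @_;

            # Racy: the price can change between the select and the update.
            my ($price) = $dbh->selectrow_array(
                'select price from items where item_id = ?', undef, $item_id );
            $dbh->do( 'update invoices set total = total + ? where invoice_id = ?',
                      undef, $price, $invoice_id );
        }

        sub add_item_atomic {
            my ( $dbh, $invoice_id, $item_id ) = @_;

            # The DB reads the price and updates the invoice in one unit of work.
            $dbh->do( 'exec add_item_to_invoice ?, ?', undef, $invoice_id, $item_id );
        }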

        Perhaps not a good example, as it can be legitimately argued that allowing prices to change on the fly that way could lead to the customer being invoiced a different price from the one they agreed to--though they probably wouldn't complain if it was a reduction :).

        A better example is the stock level changing. At the point the customer commits to purchasing the items on their pick list, all of those items should be added to the invoice, and any items no longer in stock (sold out between the time they were added to the list and the time the purchase was committed) should be detected and omitted from the invoice raised.

        There are all sorts of holes in that example, but the point is that by performing the business function inside the DB rather than in the application code, you get far fewer race conditions than by embedding the same logic in application code.

        I don't expect that to convince anyone, but coming up with short, accurate examples is hard.

        It's most instructive to try to code objects in multi-tasking environments, where the actual value of any state queried from an object can change even before the querying code receives the value. In these environments, the only way to achieve synchronisation is to ask the object to change its state. All those "OO" designs that are little more than C-style structs with setters and getters rapidly fall in a heap in a multi-tasking environment where objects can be shared.

        This is exactly the same problem as with DB code that queries and manipulates values, rather than requesting state changes to high-level business objects. The state of any value queried can change before an application-modified value is written back to the DB. Record locking can help, but in the end you set yourself up for a mess of race conditions that are a bugger to sort out.

        IMO the only thing harder than good OO design is good DB design. With the latter, you generally have many more restrictions to work within.

