ronzomckelvey has asked for the wisdom of the Perl Monks concerning the following question:

I've been creating this web based application that basically allows you to select data from a database then do lookups and logic on other tables in other databases. The idea is to be able to create export files from any source in whatever format with no coding on the DB side. All external, no changes made to the production boxes, just a read-only account will surfice.

My example would be looking up invoices in Oracle Applications (Oracle DB) and for each invoice lookup the part number off the AS/400 (DB2 DB) then create an export file with the combined data, or push this data into another database. Gotta love Perl and the DBI.

For each row returned from Oracle I do 4 field lookups in Oracle and 3 off the AS/400.

After using the prepare_cached I was disappointed in the overall performance. So I wrote my own caching method, which appears to be what the DBI does (at least by description). I created a %CACHE with the select statement as the Key and a Hash representing the columns returned as the %CACHE value. Before doing the logic select I check my %CACHE first then use it's value if it exists. Thus a cache.

I ran the same code, it was all identical except for changing prepare to prepare_cached, and commenting out my cache routines.

just prepare
1000 recs    3m25s   15m used

with prepare_cached
1000 recs    3m51s   87m used

with prepare & self cache
1000 recs    1m20s   15m used
The just prepare is the baseline with no caching. And after turning on prepare_cached it actually got slower and it used a lot of memory, 87m.
Using my cache it was alot faster and it only used 200k or so of extra memory then the plain prepare.

Am I missing something here with prepare_cached? Has anyone had success with prepare_cached?

The only thing I can even imagine that might screw it up is the fact that I'm opening numerous connections to different databases. My DBI handle is a Hash with the Key being the name of the logic, these logic statements are the field lookups into other database sources.

 $Lsth = $Ldbh{"$dologic"}->prepare_cached("$a");

I can live with it the way it is, now that I've build my own caching (since when is $C{$A}=\%{"H".$z} considered building something?)

On the output below, there's 1,113 misses on the cache and 5,886 hits. With all the testing I've done, it's actually faster to use my logics and do seperate cached lookups for non-unique columns then having Oracle ( or the AS/400) do the actual joins to get the column data.

Here's my output using prepare and my cache:

OPS-FSP - Booked Invoice for Material File (FSP) Making Text file Select Complete Records Processed: 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, Total Cache Hits: 5,886 @ 84.1 % OPS Buyer Code - hits: 999 OPS ESN/ARE - hits: 950 OPS GL Code - hits: 993 OPS Part Number - hits: 603 OPS UOM - hits: 999 OPS Vendor Delivery Date - hits: 636 Part Desc - hits: 706 Total Cache Miss: 1,113 @ 15.9 % OPS Buyer Code - miss: 1 OPS ESN/ARE - miss: 50 OPS GL Code - miss: 7 OPS Part Number - miss: 397 OPS UOM - miss: 1 OPS Vendor Delivery Date - miss: 363 Part Desc - miss: 294 Cache Size (in bytes) Size of keys: 143,887 Size of values: 57,803 Click here for your Text File OPS-FSP-10032.txt, it is 501,000 bytes (Right-Click Save Target As) Click here for your ZIP File OPS-FSP-10032.zip, it is 36,474 bytes (Right-Click Save Target As) Total Records: 1,000, Total Skipped: 0, Warnings: 1, Errors: 0 + Elapsed Time: 1 minutes and 20 seconds

20030928 Edit by Corion: Changed PRE tags to CODE

Replies are listed 'Best First'.
Re: DBI prepare_cached VS self cached
by perrin (Chancellor) on Sep 27, 2003 at 16:01 UTC
    You are not understanding what prepare_cached does. It has nothing to do with caching actual results from queries. Rather it is a cache of statement handles, because the work of creating a statement handle and the corresponsing database work for it is relatively slow with databases like Oracle.

    I'm surprised that you found it to be slower than normal prepare, since I have always found it to be much faster. I'm guessing that either your hit rate for doing the same query again is very low (you are using bind variables, aren't you?), or you have a bug in your code somewhere. The additional size might have something to do with the AS400 DBD driver. I've never seen that happen with Oracle or MySQL.

    FYI, when you're curious about what DBI is doing, it's pretty easy to just look at the code for things like this. Here's the complete code for prepare_cached:

    sub prepare_cached { my ($dbh, $statement, $attr, $allow_active) = @_; # Needs support at dbh level to clear cache before complaining abo +ut # active children. The XS template code does this. Drivers not usi +ng # the template must handle clearing the cache themselves. my $cache = $dbh->FETCH('CachedKids'); $dbh->STORE('CachedKids', $cache = {}) unless $cache; my @attr_keys = ($attr) ? sort keys %$attr : (); my $key = ($attr) ? join("~~", $statement, @attr_keys, @{$attr}{@a +ttr_keys}) : $statement; my $sth = $cache->{$key}; if ($sth) { if ($sth->FETCH('Active') && ($allow_active||0) != 2) { Carp::carp("prepare_cached($statement) statement handle $sth w +as still active") if !$allow_active; $sth->finish; } return $sth; } $sth = $dbh->prepare($statement, $attr); $cache->{$key} = $sth if $sth; return $sth; }
      I guess what had me thinking it acted like a cache was this statement from the docs.

      "prepare_cached"
                   $sth = $dbh->prepare_cached($statement)
                   $sth = $dbh->prepare_cached($statement, \%attr)
                   $sth = $dbh->prepare_cached($statement, \%attr, $allow_active)
      
                 Like "prepare" except that the statement handle returned will be
                 stored in a hash associated with the $dbh. If another call is made
                 to "prepare_cached" with the same $statement and %attr values, then
                 the corresponding cached $sth will be returned without contacting
                 the database server.
      
      
        Right, the cached statement handle will be returned. Creating a statement handle for an Oracle database normally requires contacting the database server to have it prepare (basically compile and determine an execution plan for) the query. Note that it doesn't say anything about $sth->execute() or fetch() being cached.