DBI prepare_cached VS self cached

ronzomckelvey has asked for the wisdom of the Perl Monks concerning the following question:

I've been creating this web based application that basically allows you to select data from a database then do lookups and logic on other tables in other databases. The idea is to be able to create export files from any source in whatever format with no coding on the DB side. All external, no changes made to the production boxes, just a read-only account will surfice.

My example would be looking up invoices in Oracle Applications (Oracle DB) and for each invoice lookup the part number off the AS/400 (DB2 DB) then create an export file with the combined data, or push this data into another database. Gotta love Perl and the DBI.

For each row returned from Oracle I do 4 field lookups in Oracle and 3 off the AS/400.

After using the prepare_cached I was disappointed in the overall performance. So I wrote my own caching method, which appears to be what the DBI does (at least by description). I created a %CACHE with the select statement as the Key and a Hash representing the columns returned as the %CACHE value. Before doing the logic select I check my %CACHE first then use it's value if it exists. Thus a cache.

I ran the same code, it was all identical except for changing prepare to prepare_cached, and commenting out my cache routines.

just prepare
1000 recs    3m25s   15m used

with prepare_cached
1000 recs    3m51s   87m used

with prepare & self cache
1000 recs    1m20s   15m used

The just prepare is the baseline with no caching. And after turning on prepare_cached it actually got slower and it used a lot of memory, 87m.
Using my cache it was alot faster and it only used 200k or so of extra memory then the plain prepare.

Am I missing something here with prepare_cached? Has anyone had success with prepare_cached?

The only thing I can even imagine that might screw it up is the fact that I'm opening numerous connections to different databases. My DBI handle is a Hash with the Key being the name of the logic, these logic statements are the field lookups into other database sources.

$Lsth = $Ldbh{"$dologic"}->prepare_cached("$a");

I can live with it the way it is, now that I've build my own caching (since when is $C{$A}=\%{"H".$z} considered building something?)

On the output below, there's 1,113 misses on the cache and 5,886 hits. With all the testing I've done, it's actually faster to use my logics and do seperate cached lookups for non-unique columns then having Oracle ( or the AS/400) do the actual joins to get the column data.

Here's my output using prepare and my cache:

OPS-FSP - Booked Invoice for Material File (FSP)
Making Text file
Select Complete
Records Processed: 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000,

Total Cache Hits:  5,886   @  84.1 %
OPS Buyer Code - hits:     999
OPS ESN/ARE - hits:     950
OPS GL Code - hits:     993
OPS Part Number - hits:     603
OPS UOM - hits:     999
OPS Vendor Delivery Date - hits:     636
Part Desc - hits:     706

Total Cache Miss:  1,113   @  15.9 %
OPS Buyer Code - miss:     1
OPS ESN/ARE - miss:     50
OPS GL Code - miss:     7
OPS Part Number - miss:     397
OPS UOM - miss:     1
OPS Vendor Delivery Date - miss:     363
Part Desc - miss:     294

Cache Size (in bytes)
Size of keys:  143,887
Size of values:  57,803

Click here for your Text File OPS-FSP-10032.txt, it is 501,000 bytes
(Right-Click Save Target As)

Click here for your ZIP File OPS-FSP-10032.zip, it is 36,474 bytes
(Right-Click Save Target As)
Total Records: 1,000,    Total Skipped: 0,   Warnings: 1,   Errors: 0 
+        Elapsed Time: 1 minutes and 20 seconds
[download]

20030928 Edit by Corion: Changed PRE tags to CODE

Comment on DBI prepare_cached VS self cached Select or Download Code

Replies are listed 'Best First'.
Re: DBI prepare_cached VS self cached by perrin (Chancellor) on Sep 27, 2003 at 16:01 UTC
You are not understanding what prepare_cached does. It has nothing to do with caching actual results from queries. Rather it is a cache of statement handles, because the work of creating a statement handle and the corresponsing database work for it is relatively slow with databases like Oracle. I'm surprised that you found it to be slower than normal prepare, since I have always found it to be much faster. I'm guessing that either your hit rate for doing the same query again is very low (you are using bind variables, aren't you?), or you have a bug in your code somewhere. The additional size might have something to do with the AS400 DBD driver. I've never seen that happen with Oracle or MySQL. FYI, when you're curious about what DBI is doing, it's pretty easy to just look at the code for things like this. Here's the complete code for prepare_cached: sub prepare_cached { my ($dbh, $statement, $attr, $allow_active) = @_; # Needs support at dbh level to clear cache before complaining abo +ut # active children. The XS template code does this. Drivers not usi +ng # the template must handle clearing the cache themselves. my $cache = $dbh->FETCH('CachedKids'); $dbh->STORE('CachedKids', $cache = {}) unless $cache; my @attr_keys = ($attr) ? sort keys %$attr : (); my $key = ($attr) ? join("~~", $statement, @attr_keys, @{$attr}{@a +ttr_keys}) : $statement; my $sth = $cache->{$key}; if ($sth) { if ($sth->FETCH('Active') && ($allow_active\|\|0) != 2) { Carp::carp("prepare_cached($statement) statement handle $sth w +as still active") if !$allow_active; $sth->finish; } return $sth; } $sth = $dbh->prepare($statement, $attr); $cache->{$key} = $sth if $sth; return $sth; } [download]	[reply] [d/l]
Re: Re: DBI prepare_cached VS self cached by ronzomckelvey (Acolyte) on Sep 27, 2003 at 19:56 UTC
I guess what had me thinking it acted like a cache was this statement from the docs. "prepare_cached" $sth = $dbh->prepare_cached($statement) $sth = $dbh->prepare_cached($statement, \%attr) $sth = $dbh->prepare_cached($statement, \%attr, $allow_active) Like "prepare" except that the statement handle returned will be stored in a hash associated with the $dbh. If another call is made to "prepare_cached" with the same $statement and %attr values, then the corresponding cached $sth will be returned without contacting the database server.	[reply]
Re: Re: Re: DBI prepare_cached VS self cached by perrin (Chancellor) on Sep 27, 2003 at 20:12 UTC
Right, the cached statement handle will be returned. Creating a statement handle for an Oracle database normally requires contacting the database server to have it prepare (basically compile and determine an execution plan for) the query. Note that it doesn't say anything about $sth->execute() or fetch() being cached.	[reply]
Re: Re: Re: Re: DBI prepare_cached VS self cached by inspire22 (Initiate) on Oct 25, 2003 at 05:07 UTC
Re: Re: Re: Re: Re: DBI prepare_cached VS self cached by perrin (Chancellor) on Oct 25, 2003 at 06:32 UTC