arturo has asked for the wisdom of the Perl Monks concerning the following question:

The problem space: I'm creating a web-logging system that records hits to one Apache server to an Oracle database.

There are a number of virtual hosts (47 at last count) and I want to log hits to specific locations on each virtual host.

So it seems I have this kind of hierarchy: Vhost -> location.

Now I've got one script that's going to yank the hits out of the logfile and insert them into the DB. The Vhost and Location objects are going to need access to the DB, but I don't want to create a new db handle for each instance of a location (there are going to be hundreds in all).

Each hit is logged as being to a specific location (file or directory), so each unique location needs a unique identifier. However, locations of the same name may occur on different virtual hosts, so each location needs to be tied to a particular virtual host. For reporting purposes, all this stuff needs labels (virtual hosts have names, the locations have labels).

So here's one way of writing the script that uses these objects:

    use strict;
    use DBI;
    my $dbh = DBI->connect(blah blah blah) or die "blah";
    use Vhost;
    use Vhost::Location;

    my @vhosts;
    my @vhost_ids;    # placeholder: list of identifiers from DB
    # the constructor takes the ID (the primary key)
    # of the virtual host, queries the DB
    # and returns all the
    # properties of the host (name, etc.)
    foreach (@vhost_ids) {
        push @vhosts, Vhost->new($_);
    }

    my @locations;
    foreach my $vhost (@vhosts) {
        my @location_ids = $vhost->get_location_ids();
        foreach (@location_ids) {
            # here, as above, the constructor takes the primary key
            # of the object in the DB, and returns the other properties
            push @locations, Vhost::Location->new($_);
        }
    }

What I'm left with are two arrays: one of location objects, and one of virtual host objects. But clearly, both the Vhost module and the Location module need to access the db. I want to use the same handle all the way through.

Thanks for any input!

Philosophy can be made out of anything -- or less

Re: Sharing a database handle among objects
by lhoward (Vicar) on Oct 03, 2000 at 21:29 UTC
    One solution would be to pass your database handle ($dbh) to the Location and Vhost objects when they are created. Those objects would store the database handle internally and use it as needed. Provided your DB has no problem executing more than one query at a time on a single DB handle (some do), or you code to avoid that situation, everything should work fine.

    The following is untested, but should be right in spirit....

        my $foo=Vhost->new("foo",$dbh);
        ....blah blah

        package Vhost;
        .....
        sub new{
          my $class = shift;
          my $self = {};
          bless $self,$class;
          # set any default values here
          %{$self} = ();
          #there are much cleaner ways of doing this
          $self->{name}=shift;
          $self->{dbh}=shift;
          return $self;
        }

        sub bar{
          my $self=shift;
          $self->{dbh}->prepare ...
        }

Re: Sharing a database handle among objects
by chromatic (Archbishop) on Oct 03, 2000 at 21:55 UTC
    I don't currently see the need for separate Vhost::Location objects -- the Vhost objects already have the ability to enumerate the locations they're tracking.

    That indicates to me that there's already some data structure within the Vhost object that knows which locations need to be tracked. Unless there's some very compelling reason that the Locations have to be separate objects, I'd keep them as member data of the Vhost objects.
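
A minimal sketch of keeping locations as member data of the Vhost object, under assumptions: only `get_location_ids` appears in the original question; the named constructor arguments, `add_location`, `location_label`, and the sample host and paths are hypothetical, and no database is touched.

```perl
use strict;
use warnings;

package Vhost;

# Hypothetical constructor: vhost id, name, and the shared handle.
sub new {
    my ($class, %args) = @_;
    my $self = {
        id        => $args{id},
        name      => $args{name},
        dbh       => $args{dbh},
        locations => {},    # location id => { label => ..., path => ... }
    };
    return bless $self, $class;
}

# Record a location as plain member data instead of a separate object.
sub add_location {
    my ($self, $id, %props) = @_;
    $self->{locations}{$id} = {%props};
    return $self;
}

# Location ids tracked by this vhost, in numeric order.
sub get_location_ids {
    my $self = shift;
    return sort { $a <=> $b } keys %{ $self->{locations} };
}

sub location_label {
    my ($self, $id) = @_;
    return $self->{locations}{$id}{label};
}

package main;

my $vhost = Vhost->new(id => 1, name => 'www.example.com', dbh => undef);
$vhost->add_location(10, label => 'Docs',      path => '/docs');
$vhost->add_location(11, label => 'Downloads', path => '/downloads');

print join(', ', map { $vhost->location_label($_) } $vhost->get_location_ids), "\n";
```

With this layout only the Vhost needs to hold the handle, since the locations are just hash entries inside it.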

    My mental image of your database schema is that you have one table for each Vhost, with location, hits, and perhaps modification time columns. If it's normalized, you might have a hosts table with a host_id, and a locations table as described before, but with a host_id which can be used to join on the hosts table.

    In either case, I haven't fully answered your question yet.

    I would definitely pass $dbh to the Vhost constructor. When it comes time to perform a database operation, check that the handle is defined and create a new one if necessary.

    Unless you're writing a multithreaded program, or are getting database handles from a pool (as one would expect with Apache::DBI), you won't see concurrent access on the same handle and things should work just fine.
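
The "check that it is defined, otherwise create one" idea might look roughly like the sketch below. The `connect` coderef and the `dbh` accessor are assumptions made for illustration, and the handle is a plain hash reference standing in for a real DBI handle so the example runs without a database.

```perl
use strict;
use warnings;

package Vhost;

sub new {
    my ($class, %args) = @_;
    my $self = {
        id      => $args{id},
        dbh     => $args{dbh},        # may be undef
        connect => $args{connect},    # hypothetical: coderef returning a new handle
    };
    return bless $self, $class;
}

# Return the stored handle, creating one only if none is defined yet.
sub dbh {
    my $self = shift;
    $self->{dbh} = $self->{connect}->() unless defined $self->{dbh};
    return $self->{dbh};
}

package main;

my $connections = 0;

# Stand-in for a real connect call; returns a fake handle and counts calls.
my $connect = sub { $connections++; return { fake_handle => $connections } };

my $shared  = $connect->();    # handle created once by the script
my $with    = Vhost->new(id => 1, dbh => $shared, connect => $connect);
my $without = Vhost->new(id => 2, connect => $connect);

$with->dbh;       # reuses $shared, no new connection
$without->dbh;    # no handle was passed in, so one is created
$without->dbh;    # second call reuses the handle created above

print "connections made: $connections\n";    # prints "connections made: 2"
```

Routing every database call through the accessor keeps the fallback logic in one place.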

      My mental image of your database schema is that you have one table for each Vhost, with location, hits, and perhaps modification time columns. If it's normalized, you might have a hosts table with a host_id, and a locations table as described before, but with a host_id which can be used to join on the hosts table.

      Actually, the schema is one table for vhosts, one for locations, and one for hits.

      The location table has a vhost_id column (as a foreign key) and the hit table has a location ID linked as a foreign key.

      Additionally, although I'm not sure of the wisdom of this move, since locations can nest, I have a 'parent id' linked as a foreign key within the location table (i.e. a location can have a 'parent' ... this way I can keep track of hits to a directory that aren't hits to the particular file).
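
To illustrate what the parent id makes possible, here is a small sketch that rolls hits up the parent chain, so a directory's total includes hits on everything beneath it while its direct count stays separate. The rows are invented sample data held in hashes, not real query results, and the column names are assumptions.

```perl
use strict;
use warnings;

# Hypothetical rows as they might come back from the location table.
my %location = (
    1 => { vhost_id => 1, path => '/docs',            parent_id => undef },
    2 => { vhost_id => 1, path => '/docs/faq.html',   parent_id => 1 },
    3 => { vhost_id => 1, path => '/docs/api',        parent_id => 1 },
    4 => { vhost_id => 1, path => '/docs/api/a.html', parent_id => 3 },
);

# Hits logged directly against each location id (invented numbers).
my %direct_hits = (1 => 5, 2 => 20, 3 => 0, 4 => 7);

# Credit every hit to its own location and to each ancestor.
my %total_hits = map { $_ => 0 } keys %location;
for my $id (keys %direct_hits) {
    my $cur = $id;
    my %seen;
    while (defined $cur && !$seen{$cur}++) {    # %seen guards against a cyclic parent_id
        $total_hits{$cur} += $direct_hits{$id};
        $cur = $location{$cur}{parent_id};
    }
}

for my $id (sort { $a <=> $b } keys %location) {
    printf "%-18s direct=%2d total=%2d\n",
        $location{$id}{path}, $direct_hits{$id}, $total_hits{$id};
}
```

In this sample, /docs has 5 direct hits and a total of 32 once its descendants are counted.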

      All this is supposed to be user-configurable (the various admins of the various virtual hosts can set up which locations they want reporting on)

      If this brings to mind any further ideas about design, etc., I'd love to hear them.

      Thanks to all who've replied so far!


(tye)Re: Sharing a database handle among objects
by tye (Sage) on Oct 03, 2000 at 21:43 UTC

    If Vhost -> Location, then I'd have the Vhost create the Locations so Vhost can decide what to pass to the Locations c'tor. Then do what lhoward says.

    It might be best for each Location to keep a reference to the Vhost. Then the Location could cache the dbhandle, request the dbhandle from the Vhost each time, or submit db requests via the Vhost, depending what makes sense to you.
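
One way this could look, as a sketch only: the Vhost creates its Locations and hands each one a reference to itself, and the Location asks the Vhost for the handle on every call. The `add_location` and `dbh` method names are assumptions, the handle is a fake stand-in so the example runs without a database, and weakening the back-reference is an addition here to avoid a reference cycle.

```perl
use strict;
use warnings;

package Vhost;

sub new {
    my ($class, $name, $dbh) = @_;
    return bless { name => $name, dbh => $dbh, locations => [] }, $class;
}

sub dbh { $_[0]{dbh} }

# The Vhost builds its own Locations, so it decides what their constructor receives.
sub add_location {
    my ($self, $label) = @_;
    my $loc = Vhost::Location->new($label, $self);
    push @{ $self->{locations} }, $loc;
    return $loc;
}

package Vhost::Location;
use Scalar::Util qw(weaken);

sub new {
    my ($class, $label, $vhost) = @_;
    my $self = bless { label => $label, vhost => $vhost }, $class;
    weaken($self->{vhost});    # the Vhost owns the Location; avoid a reference cycle
    return $self;
}

# Ask the owning Vhost for the handle each time rather than caching it.
sub dbh { $_[0]{vhost}->dbh }

package main;

my $dbh   = { fake_handle => 1 };    # stand-in for a real database handle
my $vhost = Vhost->new('foo', $dbh);
my $docs  = $vhost->add_location('Docs');
my $faq   = $vhost->add_location('FAQ');

print "same handle\n" if $docs->dbh == $dbh && $faq->dbh == $dbh;
```

Asking the Vhost each time means a replaced handle is picked up by every Location automatically; caching it in the Location saves one method call per query.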

            - tye (but my friends call me "Tye")
RE: Sharing a database handle among objects
by geektron (Curate) on Oct 03, 2000 at 23:13 UTC
    if you have the ability to run mod_perl, there's another way to do this.

    merlyn published this in Web Techniques. It logs directly to the DB (rather than parsing log files), and has another script to generate statistics. I haven't implemented it anywhere (my webservers get no hits), but it looks useful.

      I'd noticed the article you're talking about a while back when I was first planning this system, but mod_perl is unfortunately out of the question here. The webserver is pretty heavily loaded as things are =)

      Thanks again to all who replied!

      Peevish Update: To those --ing: this is a situation not of my making, nor is it in my power to change. I fully realize the benefits of mod_perl, but the scripts I'm writing are among the few Perl scripts actually running on this server. Mostly it's other kinds of content. The web server's proper functioning is critical, so I understand the reluctance to add any module as complex as mod_perl to it.


        *boggle*

        Let me get this straight. You are not going to mod_perl, whose very reason for being is to do the same thing as CGI scripts do while generating less load, because you are too loaded?

        That very much looks to me like a decision made by someone who does not understand the technology!

Re: Sharing a database handle among objects
by princepawn (Parson) on Oct 03, 2000 at 22:04 UTC
  • DBD::Proxy, together with DBI::ProxyServer, is part of DBI. It can be configured to return the same connection to a database in response to different connect requests. I found it a bit hard to figure out how to set up.
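  • Apache::DBI (which is not in the Database Interfaces part of SEARCH.CPAN.ORG (tsk, tsk, graham)) keeps a persistent DBI database handle open in each Apache child process under mod_perl, so repeated connect requests within a process reuse that handle; it does not share one handle between processes.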