Hello,

Over the past few days, I have prepared a small module for use in scripts that need to ‘cache’ a table or view (Oracle data views in particular) in memory, both for quicker access at run time and for reduced load on the database.

Initially, I was reluctant to discuss this module on PerlMonks, as I wasn’t certain whether it would find use among other Perl programmers. For one, I know it helped improve a number of my scripts considerably. Aside from the aforementioned benefits, this module is sometimes a cleaner substitute for direct DBI calls dispersed throughout your code. In one of my scripts, I simply aggregated every SQL statement inside a hash in a config file and used them later to initialize my DBIx::TableHash objects. Later in the code, I simply invoked the get_hashref() method of my DBIx::TableHash object(s) to retrieve data matching certain key values.
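To make that pattern concrete, here is a minimal, self-contained sketch. The %sql hash and the $lookup name are hypothetical examples of my own, and the statements are illustrative only; the commented-out loop is what would actually run against a connected DBI handle:

```perl
use strict;
use warnings;

# Hypothetical config hash aggregating every SQL statement in one place.
my %sql = (
    countries => q/ select countrycode, countryname from countries /,
    provinces => q/ select provcode, provname, countrycode from provinces /,
);

# At startup, each statement would seed one DBIx::TableHash object:
#
#   my %lookup;
#   for my $name (keys %sql) {
#       $lookup{$name} = DBIx::TableHash->new(
#           dbh       => $dbh,          # a connected DBI handle
#           statement => $sql{$name},
#       );
#   }
#
# ...and later in the code:
#
#   my $recs = $lookup{countries}->get_hashref(countrycode => 'USA');

print scalar(keys %sql), " statements configured\n";   # prints: 2 statements configured
```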


I should point out that alternative key mapping is another distinctive feature of this module. At work, I have to deal with data coming from various vendors (feed providers etc., as most of you in the web development field would know), and more often than not I find a few key values (record indexes, e.g. a city code used to identify a city record) that differ from vendor to vendor. When you are talking about only a dozen (or fewer) distinct key values, it’s more efficient to use this module’s add_key_mapping() method to bridge the two key values (the one you have in your database and the one provided by your feed supplier). Certainly, if you have a larger set of differing keys, you’d be better off implementing dedicated reference tables; however, that is left for you to deal with on your own merit ;).
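The aliasing boils down to pointing a second key value at the same record indexes. Here is a stripped-down, pure-Perl illustration (the record indexes are made up for the example; the %record_keys layout mirrors the module’s internal record_keys structure):

```perl
use strict;
use warnings;

# Simplified copy of the module's internal index: key field -> key
# value -> list of matching record indexes.
my %record_keys = (
    COUNTRYNAME => {
        'England' => [0, 3, 7],   # indexes of the matching records
    },
);

# What add_key_mapping(countryname => 'England', 'UNITED KINGDOM')
# effectively does: alias the vendor's name to the same indexes.
my $existing = $record_keys{COUNTRYNAME}{'England'};
push @{ $record_keys{COUNTRYNAME}{'UNITED KINGDOM'} }, @$existing;

print "@{ $record_keys{COUNTRYNAME}{'UNITED KINGDOM'} }\n";   # prints: 0 3 7
```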

Ultimately, my question to you is this: do you see this module being useful enough in your work (or that of your peers/friends)? If so, would you give your blessing for this module to be included on CPAN? Again, my sincere concern is to help out other Perl developers in need of the capabilities this module offers. I would feel terrible not sharing it with the community ;-).

Also, I would appreciate it if you could suggest any useful module-building utilities you know of ;).

Note: please excuse me for not having finished writing the POD for this module ;/. I’m working on it in the meantime.

Module code:
package DBIx::TableHash;

# AUTHOR: Vladimir Bogdanov
#
# $Date: 2002/04/24 03:38:21 $
# $Id: TableHash.pm,v 1.11 2002/04/24 03:38:21 vlad Exp $
# $Revision: 1.11 $
#
# $Log: TableHash.pm,v $
# Revision 1.11  2002/04/24 03:38:21  vlad
# *** empty log message ***
#
# Revision 1.10  2002/04/23 20:37:32  vlad
# fixed serious flaw in get_hashref whereby hash keys would not be
# assigned correct values.
#
# Revision 1.9  2002/04/22 22:38:03  vlad
# release
#
# Revision 1.8  2002/04/22 20:00:12  vlad
# with add_key_mapping() existing keys may be mapped to existing records
# via additional/alternative key values. For example, in my weather
# city analyzer script, I had to add this line:
#
#   $lookup{countryname_code}->add_key_mapping(countryname => 'England', 'UNITED KINGDOM');
#
# to also make 'UNITED KINGDOM' (country name used by our vendor) point
# to all records that the 'England' key (our name for the same country)
# points to.
#
# Revision 1.7  2002/04/22 19:40:57  vlad
# new subroutine add_record() may now be used to add a record to the table.
#
# Revision 1.6  2002/04/22 19:17:21  vlad
# added get_hashref to return records as hashes with keys being column
# names and values.. err.. their respective values ;). Also, it is
# now possible to look for multiple key values (not multiple keys just
# yet).
#
# Revision 1.5  2002/04/19 22:28:34  vlad
# minor
#
# Revision 1.4  2002/04/19 22:13:09  vlad
# First working release.
#
# Revision 1.3  2002/04/19 20:51:21  vlad
# fixed a few bugs. Rearranged code flow.
#
# Revision 1.2  2002/04/19 20:45:56  vlad
# beta.
#
# Revision 1.1  2002/04/19 17:46:52  vlad
# a simple module to store table data in a hash (for quicker access).
# The hash could also be dumped into a static file (using Data::Dumper)
# for use on subsequent runs of the same script.
#
#
# DESCRIPTION:
#   somewhat similar to DBIx::Cache, but much simpler and
#   serving a narrower purpose.
#
# TODO:
#   0. Enable lookup by multiple keys so that only records
#      containing all matching keys will be returned.
#      Could also implement support for complex lookup
#      rules (close to what you'd get with an SQL WHERE clause).
#
#   1. Add a set(field => value) method to allow the user to set
#      a record field to a new value.
#
#   2. Add commit() to save data back into the database.
#      Note: may have to parse the original SQL statement
#      in order to build a proper UPDATE SQL command.

use strict;

#
# Instantiates an object that will store database data
# from a single table (or multiple tables for that matter, depending
# on the kind of SQL statement used to grab that data)
# in an internal, specially arranged structure to facilitate
# a quick key-value lookup mechanism.
#
# example:
#   statement = "select col1, col2, col3 from foobar"
#   fields    = qw(col1 col2 col3)
#   keys      = qw(col2)
#
# if the returned data is:
#
#   col1_val1, col2_val1, col3_val1   # record 1
#   col1_val2, col2_val2, col3_val2   # record 2
#   col1_val3, col2_val3, col3_val3   # record 3
#
# the data will be structured as follows:
#
#   %table =
#   (
#       fields => {
#           'col1' => 0,
#           'col2' => 1,
#           'col3' => 2,
#       },
#       records => [
#           ['col1_val1', 'col2_val1', 'col3_val1'],   # record 1
#           ['col1_val2', 'col2_val2', 'col3_val2'],   # record 2
#           ['col1_val3', 'col2_val3', 'col3_val3'],   # record 3
#       ],
#       record_keys => {
#           # col2 serves as key
#           col2 => {
#               # key field value => list of matching record indexes
#               col2_val1 => [0],
#               col2_val2 => [1],
#               col2_val3 => [2],
#           },
#       },
#   )
#
# So, to find col3 of the record keyed by a value of col2, you'd do:
#
#   $table{records}[ $table{record_keys}{col2}{'col2_val2'}[0] ]
#                  [ $table{fields}{col3} ]
#
# which is equivalent to this SQL:
#
#   select col3 from foobar where col2 = 'col2_val2';
#
sub new
{
    my $pkg = shift;

    my $self;
    {
        my %hash;
        $self = bless(\%hash, $pkg);
    }

    my (%vars) = @_;

    my $ar_keys;
    if (exists $vars{keys}) {
        if (ref $vars{keys} eq "ARRAY") {
            @{$ar_keys} = map { uc($_) } @{$vars{keys}};
        }
        elsif (ref $vars{keys} eq "SCALAR") {
            # a single key passed by reference
            $ar_keys = [ uc(${$vars{keys}}) ];
        }
    }

    my $statement   = $vars{statement};
    my $params_aref = $vars{params};
    my $dbh         = $vars{dbh};

    my $sth = $dbh->prepare($statement);
    $sth->execute(@$params_aref) or die $sth->errstr;

    # key field(s) (will allow easy hash key lookup).
    # use the first field by default.
    my $ar_fields = $sth->{NAME};
    $ar_keys ||= [ $ar_fields->[0] ];    # first field as key by default.

    my $i = 0;
    %{$self->{table}{fields}} = map { $_ => $i++ } @$ar_fields;

    while (my @row = $sth->fetchrow_array()) {
        $self->add_record($ar_keys, [@row]);
    }

    return $self;
}

#
# add a record
#
# $ar_rec : reference to record array (all field values).
#           Care should be taken to make sure that field values
#           are arranged in the proper order here, matching the
#           original column order in the SQL that was used to
#           build the data view (table data).
#
sub add_record
{
    my ($self, $ar_keys, $ar_rec) = @_;

    # store in records list
    push @{$self->{table}{records}}, $ar_rec;
    my $record_indx = scalar(@{$self->{table}{records}}) - 1;

    foreach my $key_field (@$ar_keys) {
        my $key_val = $ar_rec->[ $self->{table}{fields}{$key_field} ];

        # create key field mapping
        push @{$self->{table}{record_keys}{$key_field}{$key_val}}, $record_indx;
    }
}

#
# Maps a given key value to existing record(s).
# Usually done to add new keys that link to existing records.
#
# $key_field    : key field name to map
# $key_value    : existing key value
# $map_to_value : new (alternative) key value
#
# returns: undef if mapping failed (e.g. no record to map to).
#
sub add_key_mapping
{
    my ($self, $key_field, $key_value, $map_to_value) = @_;

    $key_field = uc($key_field);

    # retrieve matched record indexes
    my $rec_indxs = $self->_find_record_indxs($key_field, $key_value);
    return unless ($rec_indxs);

    for my $rec_indx (@$rec_indxs) {
        # add mapping..
        push @{$self->{table}{record_keys}{$key_field}{$map_to_value}}, $rec_indx;
    }
}

#
# Returns the list of records in the table that match the key
# value. Each record is represented by an array ref of values.
#
# Note: the actual records are not copied here. Therefore, if
#       the user chooses to update a record field value, he/she
#       will be modifying the record stored in this object's
#       table.
#
sub get
{
    my ($self, $key_field, $key_value) = @_;

    my $rec_num = $self->_find_record_indxs($key_field, $key_value);

    return ($rec_num) ? @{$self->{table}{records}}[@$rec_num] : ();
}

#
# Get a list of references to matched records, each represented
# as a hash keyed by column name.
#
sub get_hashref
{
    my ($self, $key_field, $key_value) = @_;

    my $rec_num = $self->_find_record_indxs($key_field, $key_value);
    return unless ($rec_num);

    # note: $self->{table}{records} is an array ref; therefore
    # @{$self->{table}{records}}[@$rec_num] returns one or more
    # of the stored array refs (say, if one key matched several
    # records).
    my @records_arrayrefs = @{$self->{table}{records}}[@$rec_num];

    my @found_records;
    for my $rec (@records_arrayrefs) {
        # build a table record hash (keys -> column names)
        my %rec_hash;
        $rec_hash{$_} = $rec->[ $self->{table}{fields}{$_} ]
            for (keys %{$self->{table}{fields}});

        push(@found_records, \%rec_hash);
    }

    # return reference to the new list of hash refs.
    return \@found_records;
}

#
# Returns a reference to the list of record indexes that match
# the key value. $key_value may be an array ref containing
# several possible key values to compare against.
#
sub _find_record_indxs
{
    my ($self, $key_field, $key_value) = @_;

    # return if either of the two required parameters is missing.
    return unless (defined $key_field && defined $key_value);

    my $table = $self->{table};

    $key_field = uc($key_field);
    return unless exists $table->{record_keys}{$key_field};

    my $keys = (ref $key_value eq "ARRAY") ? $key_value : [$key_value];

    foreach (@$keys) {
        if (exists $table->{record_keys}{$key_field}{$_}) {
            return $table->{record_keys}{$key_field}{$_};    # ok, found!
        }
    }

    return;    # undef (i.e. empty list or undef depending on context)
}

1;

=head1 NAME

DBIx::TableHash - Perl module to build a database table/view hash for
quick lookup later in your code.

=head1 SYNOPSIS

1. Create a DBIx::TableHash object:

    my $country_table = DBIx::TableHash->new(
        dbh       => $dbh,
        statement => qq/ select countrycode, countryname
                         from countries /,
    );

2. Look up records with a matching 'countrycode' field:

    my $country_rec = $country_table->get_hashref(countrycode => 'USA');
    print "Country name: " . $country_rec->[0]{COUNTRYNAME} . "\n";

Similarly, you may create TableHash objects that allow you to look up
records by multiple keys. Here's an example of how you could do this:

1. Again, create a DBIx::TableHash object, a little more complex than
the one before:

    #
    # Note: '. . .' denotes an SQL expression of any complexity you wish.
    #
    my $country_table = DBIx::TableHash->new(
        dbh       => $dbh,
        statement => qq/ select provname, provcode,
                                countryname, countrycode
                         from . . . where . . . /,
        keys      => [qw(provcode countrycode)],    # lookup keys
    );

2.a Look up records with a matching provcode (province code):

    my $prov_rec = $country_table->get_hashref(provcode => 'BC');
    print "First province name: " . $prov_rec->[0]{PROVNAME} . "\n";

2.b Find all provinces (or states) that belong to a specified country:

    my $prov_rec = $country_table->get_hashref(countrycode => 'USA');

$prov_rec is an AoH (array of hashes) structure, where each hash
represents a record in the database table/view. This is very similar
to what you'd get with a conventional DBI selectall_hashref call.


"There is no system but GNU, and Linux is one of its kernels." -- Confession of Faith

Re: CPAN hopeful: need your input on DBIx::TableHash
by lachoy (Parson) on May 06, 2002 at 16:39 UTC

    You might want to look at getting rid of the table maintenance stuff and using Data::CTable instead. I haven't used it yet, but it looks quite useful, and it is a more general-purpose data structure than what you've presented here.

    Chris
    M-x auto-bs-mode

      Thanks for the comment! ;-)

      You are absolutely right, and in fact I did notice this and a few other similar modules on CPAN. Unfortunately, they are a few feet away from what I'm trying to accomplish. Being that close in principle, I even thought of extending the original modules to let me pull data directly from an SQL database. Further, I don't need many of the fancy manipulation routines offered by Data::CTable, and I don't need to keep data in tabular files either.

      Cheers,

      vladb.

      "There is no system but GNU, and Linux is one of its kernels." -- Confession of Faith