Category: Utillities
Author/Contact Info Charles Prichard greentv@paulbunyan.net
Description: Extensible set of Table Methods for accessing EZDB compatible, flatfile index files. The module needs lots of work.

Using my own personal web server I have a site wrapper that uses these methods. I have found they are much, much faster than ODBC methods accessing an MS_Access datafile.

The methods are suitable for applications requiring very fast access to small databases.

Latest revision uses strict pragma. Serious upgrade over last version.

-Steeeeeve -
#########################################################
# PRE-RELEASE MODULE
# Module: LADY /TM.pm LADY Table Methods
# Version BETA 1.0 r 3  2-06-2001
# Author: C. Prichard
# COPYRIGHT: GREENTV¬ Charles Prichard -1999-
# This module is a collection of methods needed
# to access and set EZDB compatible hash table data
# stored in a persistent index file on the server.
#
# Using the hashing table indexing system for data allows
# for improvement in efficiency over use of simple flat files
# if the index values are created with sortable features.
# The keys may be extracted from the table and sorted,
# then filtered using the remaining keys to access values
# in the hash table.The methods are practical only for
# small to medium sized datafiles because the entire data
# array is created in the server's memory.
#
# When the table is initialized, a multitasking system will do
# other processes while configuring the data in memory
#
# Systems that benefit are those that intensively require
# special sorting and processing of returned data.
package Lady_TM;

use strict;

#################################################
#  Sub new                                  #
#################################################
#################################################
#       PUBLIC Subroutine                       #
#################################################
sub new ($;$){

    my $class = shift;

    my $self = {};
    
    bless $self, ref $class || $class;
    
    $self->{index_path} = undef;
    $self->{lady_time} = undef;
    $self->{sep} = undef;
    
    $self->{TABLE} = {};
    $self->{TABLE_ARRAY} = [];
    
    $self->init(@_);
    
    
    return $self;

}
sub init($;$){
    
    my $self = shift;

    $self->{index_path} = shift;
    
    if ($_[0]){$self->{sep}= shift;}else{$self->{sep}="Ö";}
    
    use Lady::Lady_DATE;

    my $TIME = new Lady_DATE;

    $self->{lady_time} = $TIME->lady_time();

return;
}
#################################################
#       Sub set_specified_value_in_index_table  #
#################################################
#################################################
#       PUBLIC Subroutine                       #
#################################################
sub set_specified_value_in_index_table ($$){

    my $self = shift;

        my ($index_key,$index_value) = @_;

        if (defined $self->{index_path}) {

            $self->get_index_table();

            $self->{TABLE}{$_[0]} = $_[1];
            $self->update_index_table ($self->{TABLE});
        }

        return;
}
#################################################
#       Sub increment_specified_value_in_index_table  #
#################################################
#################################################
#       PUBLIC Subroutine                       #
#################################################
sub increment_specified_value_in_table ($){

    my $self = shift;

    my ($key) = @_;

    $self->initialize_table();

    $self->{TABLE}{$key}++;

    $self->update_index_table($self->{TABLE});

    return $self->{TABLE}{$key};

}
#################################################
#       Sub decrement_specified_value_in_index_table  #
#################################################
#################################################
#       PUBLIC Subroutine                       #
#################################################
sub decrement_specified_value_in_table ($){

    my $self = shift;

    my ($key) = @_;

    $self->initialize_table();

    $self->{TABLE}{$key}--;

    $self->update_index_table($self->{TABLE});

    return $self->{TABLE}{$key};

}
#################################################
#       Sub add_specified_amount_to_value_in_table #
#################################################
#################################################
#       PUBLIC Subroutine                       #
#################################################
sub add_specified_amount_to_value_in_table ($$){

    my $self = shift;

    my ($key,$value) = @_;

    $self->initialize_table();

        $self->{TABLE}{$key} = $self->{TABLE}{$key} + $value;

    $self->update_index_table($self->{TABLE});

    return $self->{TABLE}{$key};

}
#################################################
#       Sub get_specified_value_in_index_table  #
#################################################
#################################################
#       PUBLIC Subroutine                       #
#################################################
sub get_specified_value_in_table ($){

    my $self = shift;

        my $key = shift;

        $self->get_index_table();

        return $self->{TABLE}{$key};

}
#################################################
#       Sub get_random_option                   #
#################################################
#################################################
#       PRIVATE Subroutine                      #
#################################################
sub random_option ($){

    my $self = shift;

        return int(rand($_[0]));
}
#################################################
#       Sub get_random_value_in_index_table     #
#################################################
#################################################
#       PUBLIC Subroutine                       #
#################################################
sub get_random_value_in_index_table () {

    my $self = shift;

    my ($lines,$random_selection,$random_line);

        $self->get_index_table ();
        
        my (%TEMP) = $self->{TABLE};

        my $self->{TABLE_ARRAY} = values (%TEMP);
        $lines = scalar ($self->{TABLE_ARRAY}) - 1;
        $random_selection= $self->random_option ($lines);
        $random_line = $self->{TABLE_ARRAY}[$random_selection];

        return ($random_line);
}
#################################################
#       Sub get_index_table                 #
#################################################
#################################################
#       PUBLIC Subroutine                       #
#################################################
sub get_index_table (){

      my $self = shift;

      # flock (INDEXFILE, $LOCK_SH); #Commented for Personal Webserver
+ Perl implementation.
      open(INDEXFILE, "$self->{index_path}") || &file_open_error("$sel
+f->{index_path}","Read Index",__FILE__,__LINE__);

      $_ = <INDEXFILE>; #READ first line.

      if ($_ =~ /(.+)/){$_ = $1;}
      else {die "Bad data in index $self->{index_path}.";}
     # print "$_<br>";
    # my @db_definition = split(/\|/,$_);

      $self->{TABLE} = ();

      while (!eof(INDEXFILE)){

              $_ = <INDEXFILE>;

              if ($_ =~ /(.+)/){$_ = $1;}
              else {die "Bad data in index $self->{index_path}.";}

              my @db_row = split(/\|/,$_);

              $self->{TABLE}{$db_row[1]} = $db_row[2];

         # print  "$db_row[1],$db_row[2]<br>";
    }

      close(INDEXFILE);

      return $self->{TABLE};
} #end of get_index_table subroutine
#################################################
#        Sub update_index_table                 #
#################################################
#################################################
#       PUBLIC Subroutine                       #
#################################################
sub update_index_table (%){

      my $self = shift;

      $self->{TABLE} = @_;

      #flock (INDEXFILE, $LOCK_SH);
      open (INDEXFILE, "<$self->{index_path}") || &file_open_error("$s
+elf->{index_path}","Reading Index",__FILE__,__LINE__);

      my $newline = <INDEXFILE>;

      if ($newline =~ /(.+)/){$newline = $1;} # passes all chars
      else {die "Bad data in index $self->{index_path}.";}
      # $newline = $newline;

      $newline .= "\n";

      my @db_row = ();

      while (<INDEXFILE>){

              if ($_ =~ /(.+)/){$_ = $1;}
              else {die "Bad data in index $self->{index_path}.";}

              @db_row = split(/\|/,$_);

              # print "@db_row"."<br>";

              if (defined $self->{TABLE}{$db_row[1]}){

                  $db_row[2] = $self->{TABLE}{$db_row[1]};
                  $newline .= join "|",@db_row;
                  $newline .= "\n";

                  # print "<br>$newline";

        }

      }

      close INDEXFILE;

      # flock (INDEXFILE, $LOCK_EX);
      open (INDEXFILE, ">$self->{index_path}") || &file_open_error(">$
+self->{index_path}","Writing Index",__FILE__,__LINE__);
      # if ($newline =~ /(.+)/){$newline = $1;}
      # else {die "Bad data in index $self->{index_path}.";}
      syswrite INDEXFILE, $newline, length $newline;
      close INDEXFILE;

      return;

} #end of update_index_counters subroutine
#################################################
#       Sub get_keyed_value_array                #
#################################################
#################################################
#       PUBLIC Subroutine                       #
#################################################
#
# There is a problem here returning an array, splitting on sep
#
sub get_keyed_value_array ($){

    my $self = shift;

      $self->get_index_table();
      
      #print "STRING:$self->{TABLE}{$_[0]}<br>";
      
      #print "SEP:\"$self->{sep}\"<br>";
      
      my (@ARRAY) = split ("$self->{sep}",$self->{TABLE}{$_[0]});
      
      #print "ARRAY:$ARRAY[0],$ARRAY[1],$ARRAY[2]<br>";
      
      return (\@ARRAY);

}

#################################################
#       Sub set_keyed_value_array                #
#################################################
sub set_keyed_value_array ($@){

    my $self = shift;

      my ($key) = shift;
      
      $self->{TABLE_ARRAY} = shift;
      
      #print "KEY:$key, TABLE: $self->{TABLE_ARRAY}[0]<br>";
      
      $self->get_index_table();

      if (defined $self->{TABLE}{$key}){

          $self->{TABLE}{$key} = join "$self->{sep}",$self->{TABLE_ARR
+AY};
          $self->update_index_table($self->{TABLE});
      }
      else{
          $self->add_new_table_row($key,$self->{TABLE_ARRAY});
      }

      return;
}
#################################################
#     Sub get_nth_element_in_keyed_array      #
#################################################
#################################################
#       PUBLIC Subroutine                       #
#################################################
sub get_nth_element_in_keyed_value_array($$){
    
    my $self = shift;
    
    my ($key,$place) = @_;
    
    my @List = @{$self->get_keyed_value_array($key)};
    return $List[$place];

}
#################################################
#       Sub delete_key_value_pair                #
#################################################
#################################################
#       PUBLIC Subroutine                       #
#################################################
sub delete_key_value_pair ($%){

    my $self = shift;

    my ($key) = shift;
    
    $self->{TABLE} = shift;

    my %deleted_pair = ();

      $deleted_pair{$key} = $self->{TABLE}{$key};

      delete ($self->{TABLE}{$key});

      $self->update_index_table($self->{TABLE});

      return (\%deleted_pair);
}
#####################################################
#     Sub get_table_keys_by_specified_first_characters  #
#####################################################
#################################################
#       PUBLIC Subroutine                       #
#################################################
sub get_table_keys_by_specified_first_characters($%){

    my $self = shift;

    my ($char) = shift;
    
    my (%TEMP) = shift;

    my (@temp) = grep{[/($_[0]+)/,$_]} keys (%TEMP);

    return (\@temp);
}
#################################################
#     Sub get_numeric_keys_greater_than_passed_value#
#################################################
#################################################
#       PUBLIC Subroutine                       #
#################################################
sub get_numeric_keys_greater_than_passed_value($;%){

    my $self = shift;

    my ($val) = shift;
    
    my (%TEMP) = shift;
    
    my (@temp) = grep{($_ > $val)} keys (%TEMP);

    return (\@temp);
}
#################################################
#     Sub get_numeric_keys_less_than_passed_value      #
#################################################
#################################################
#       PUBLIC Subroutine                       #
#################################################
sub get_numeric_keys_less_than_passed_value($;%){

    my $self = shift;
    
    my ($val) = shift;
    
    my (%TEMP) = shift;

    my (@temp) = grep{($_ < $val)} keys (%TEMP);

    return (\@temp);
}
#################################################
#     Sub get_numeric_keys_in_passed_range      #
#################################################
#################################################
#       PUBLIC Subroutine                       #
#################################################
sub get_numeric_keys_in_passed_range($$%){

    my $self = shift;

    my $lower = shift;
    
    my $upper = shift;
    
    my (%TEMP) = shift;
    
    my (@temp) = grep{[($_ ge $lower) && ($_ le $upper)]} keys (%TEMP)
+;

    return (\@temp);
}
#################################################
#       Sub initialize_table                     #
#################################################
#################################################
#       PRIVATE Subroutine                       #
#################################################
sub initialize_table (){
    
    my $self = shift;

      return $self->get_index_table();

}
#################################################
#    Sub create_table                   #
#################################################
#################################################
#       PUBLIC Subroutine                       #
#################################################
sub create_table($){

    my $self = shift;

    my ($table_line) = @_;

    #TODO: Test for existence of table and cancel operation if test fa
+ils.

      open(INDEXFILE, ">$self->{index_path}") || &file_open_error("$se
+lf->{index_path}","Creating Table",__FILE__,__LINE__);
      print INDEXFILE "$table_line";
      close INDEXFILE;
}
#################################################
#    Sub add_new_table_row               #
#################################################
#################################################
#       PUBLIC Subroutine                       #
#################################################
sub add_new_table_row ($@){

    my $self = shift;

    my $key = shift;
    
    $self->{TABLE_ARRAY} = shift;
    
    #print "TABLE_ROW: $self->{TABLE_ARRAY}[0]<br>";

      #flock (INDEXFILE, $LOCK_SH);
      open (INDEXFILE, "<$self->{index_path}") || &file_open_error("$s
+elf->{index_path}","Reading Index",__FILE__,__LINE__);

      my $newline = <INDEXFILE>;

      if ($newline =~ /(.+)/){$newline = $1;} # passes all chars
      else {die "Bad data in index $self->{index_path}.";}

      my @row = ();

      $newline .= "\n";

      my $i=90000;
      my $line;
      while (<INDEXFILE>){

              if ($_ =~ /(.+)/){$_ = $1;}
              else {die "Bad data in index $self->{index_path}.";}
              my @db_row = ();
              @db_row = split(/\|/,$_);
              if ($i == 90000){@row = @db_row;}
              my @values = split(/\ /,$db_row[2]);
              #if ($lady_time < $values[3]){ #TO add something for pur
+ging
                  $db_row[0] = $i;
                  $line = join "|",@db_row;
                  $newline .= $line."\n";
                  # print "<br>$newline";
              $i++;
          #}
      }

      close INDEXFILE;

      $row[0] = $i;
      $row[1] = $key;
      $row[2] = join "$self->{sep}",$self->{TABLE_ARRAY};

      $newline .= join "|",@row;
      $newline .= "\n";

      # flock (INDEXFILE, $LOCK_EX);
      open (INDEXFILE, ">$self->{index_path}") || &file_open_error(">$
+self->{index_path}","Writing Index",__FILE__,__LINE__);
      # if ($newline =~ /(.+)/){$newline = $1;}
      # else {die "Bad data in index $self->{index_path}.";}
      syswrite INDEXFILE, $newline, length $newline;
      close INDEXFILE;

      return;

}#end of add_new_table_row subroutine

#################################################
#    Sub get_and_increment_page_counter    #
#################################################
# This method reads an EZ DB compatible flatfile and indexes a counter
# value using the passed page tag in $counter. When a matching tag is 
+found
# the associated value is incremented and concatenated to the new file
+ string.
# The index tags and counter values are placed in the %index_table arr
+ay 
# and when the method returns, it indexes the tag counter value and pl
+aces it
# in the returned value.
# If a matching tag value in counter is not found, then a new line is 
# automatically appended to the counter file using the tag, the accrue
+d row 
# count and the optional passed html page path value.
#
# EXAMPLE: (From input parsing method.)
# if ($page){
#        $page_name = $page;
#        $page = &get_page($page);    # TO DO: ADD OTHER PARSING METHO
+DS HERE            
#            $DISPLAY_TABLE{'page_counter'} = &get_and_increment_page_
+counter($page_name,$page);
#            
#            &display_page("$page","$page_name", __FILE__,__LINE__,\%D
+ISPLAY_TABLE);
#            exit 0;
#    } 
#
# Where:The display_page method will search for a %%page_counter%% tag
+ and if found, make the 
# replacement.(NOT SHOWN)
#
# The $lady_site_page_counter_path variable must be initialized proper
+ly.to
# locate the counter file.
# The file is a common flatfile with the following structure:
# pkey|page|count|path||blank
# 1000|opener|0|./html/NTTC/NTTC_opener.html||
# 1001|general_information|0|./html/NTTC/general_information.html||
# 1002|contacts|0|./html/NTTC/contacts.html||
# 1003|links|0|./html/NTTC/links.html||
# 1004|index|0|./html/NTTC/NTTC_index.html||
#
# The path is not used by the counter, but future modifications to the
+ script
# could consolidate the page path index and the counter index to impro
+ve 
# performance.

sub get_and_increment_page_counter($;$){

    my $self = shift; 
    
    my $counter = shift;
    
    my $path = shift;
    
    $self->{TABLE} = ();
        
    my $found = 0;
    open(INDEXFILE, "<$self->{index_path}") || &file_open_error($self-
+>{index_path},"Reading Counter Index",__FILE__,__LINE__);
    # flock INDEXFILE, 2;     # SET EXCLUSIVE MODE
      my $line = <INDEXFILE>;
      my $newline = $line;
      my $x=999;            # Initial Value - 1
      my @db_row = ();
      
    while (!eof(INDEXFILE)){
        
              $line = <INDEXFILE>;        
              @db_row = split(/\|/,$line); 
              if($counter eq $db_row[1]){$db_row[2]++;$found=1}
              $self->{TABLE}{$db_row[1]} = $db_row[2];
            my $nextline = join "|",@db_row;
              $newline .= $nextline;
              $x++;
        } 
        if ($found == 0){
            $x++;
            my $newstring = $x."|".$counter."|1|".$path."||\r\n";
            $newline .= $newstring;
            $self->{TABLE}{$counter} = 1;
        }
    close INDEXFILE;
    open (INDEXFILE, ">$self->{index_path}") || &file_open_error("$sel
+f->{index_path}","Writing Counter Index",__FILE__,__LINE__);
    # flock INDEXFILE, 2;     # SET EXCLUSIVE MODE
    print INDEXFILE $newline;
    close INDEXFILE;
        
    return $self->{TABLE}{$counter};
}

sub file_open_error{

                # The subroutine simply uses the update_error_log
                # subroutine discussed later to modify the error log a
+nd
                # then uses CgiDie in cgi-lib.pl to gracefully exit th
+e
                # application with a useful debugging error message se
+nt
                # to the browser window.

  my ($bad_file, $script_section, $this_file, $line_number) = @_;

  # &update_error_log("FILE OPEN ERROR-$bad_file", $this_file, $line_n
+umber);
  use CGI_lib;

  my $CGI = new CGI_lib();

  $CGI->CgiDie ("I am sorry, but I was not able to access $bad_file in
+ the
        $script_section routine of $this_file at line number $line_num
+ber.
        Would you please make sure the path is correctly defined in
        lady_site.setup and that the permissions are correct. $!")
  } # End of Sub "file_open_error"


1;
Replies are listed 'Best First'.
Re: Lady_TM
by chipmunk (Parson) on Feb 02, 2001 at 00:10 UTC
    You mention that the code needs work. Here are some things I see that can be improved.
     

    The prototypes on all of your subroutines conflict with the code in the subroutines. For example: sub new ($){ says that new takes a single argument, but then two arguments are shifted from @_:

    my $self = shift; $self = bless {}; $self{'index_path'} = shift;
    It is fortunate in this case that current versions of Perl ignore prototypes on method calls. :)
     

    Your hardcoded LOCK constants are not platform compatible:

    $LOCK_SH = '1'; $LOCK_EX = '2'; $LOCK_NB = '3'; $LOCK_UN = '4';
    The constants from Fcntl should be used instead.
     

    Globs, which are entries in the symbol table, do not include lexical variables, which are held outside the symbol table. In this code:

    my %deleted_pair = (); $deleted_pair{$key} = $TABLE{$key}; delete ($TABLE{$key}); $self->update_index_table(\%TABLE); return (*deleted_pair)
    the glob that is returned does not include the lexical hash that was just made. You probably want to return the hash itself, or a reference to it.
     

    These are a few of the more egregious issues with the current state of your module. I hope you will continue making improvements.

      Thank you.

      Yes. There are many things to work on.

      Since comparing efficiency with that of a MS_Access session, I realize that Lady /TM approach to data access has merit. MS_Access uses something like a 1MB driver with an ODBC connection.

      I think a module to make it easy to create templates to be used with Lady /TM may be worth taking a look at as well. The templates would serve the same purpose as with EZDB allowing data management via a WWW connection.

      -Steeeeeve
      2-3-01 Fixed all globs to references wherever they occurred. Changed first shift in each method to use "$class" named variable. Added "get_nth_value... method.

      I am in the habit od using the line:

      my $self = shift;

      in each method that is to be accessed externally. Keeping a method private is sometimes desireable, those methods do not include the line. I took this from the good book "Perl from the Ground UP."

      In experience, if I forget to insert this line, the method IS unavailable. I do it now out of habit.

      Looking in the Perl Cookbook I see that this first shift is used to associate the method as a member of the "class object."

      This is not something that has anything to do with the method's prototype. The prototype will never take into account that the class association is the first shifted item in object-oriented Perl class method.

      Its something I don't understand fully, but I do it.

      -Steeeeeve

        I have never heard of "Perl from the Ground UP" so I cannot tell you whether or not it is bad. But be warned that there are lots of bad books out there, and I have probably heard of most of the worthwhile ones...

        Anyways using features when you don't understand what they are supposed to be doing is generally a bad idea. That leads to cargo-cult programming, a subject which leads to rants around here from time to time.

        In particular the only way you are getting away with your prototypes is that Perl is completely and absolutely ignoring them. Which means that if you ever try to write regular procedural code you will be hopelessly confused. For a full explanation try this rant.

        Beyond that, try putting:

        use strict;
        at the top of your module and seeing how much it complains. While you are at it do not write to globals in packages outside of yours without permission. (Yes, I am talking about $::Lady_TM.) It looks like you have not yet learned about pod. The (commented out) locking code makes a number of mistakes. Starting with trying to lock the filehandle before trying to open the file. That simply cannot work (though you don't test for the error so you wouldn't notice).

        Also for hashing you might want to look at existing modules. In particular the widely-used DB_File.

        A reply falls below the community's threshold of quality. You may see it by logging in.
Re: Lady_TM
by Steeeeeve (Initiate) on Feb 02, 2001 at 10:24 UTC
    I added a method to the code that is much preferable for updating page counters. It reads the table in one pass looking for the counter, if it doesn't find it, one is added to the index for that page (presumeably a new page.)

    Steeeeeve

Re: Lady_TM
by Steeeeeve (Initiate) on Feb 02, 2001 at 13:09 UTC

    How about adding this method?

    ################################################# # Sub get_nth_element_in_keyed_array # ################################################# ################################################# # PUBLIC Subroutine # ################################################# sub get_nth_element_in_keyed_value_array($$;*){ my $self = shift; local ($key,$place,*TABLE) = @_; my @List = @{$self->get_keyed_value_array($key)} my $i = 0; my ($value) = grep {!(++$i % $place)} @List; return $value; }

    I say yes. (Assuming the method works properly.)

    -Steeeeeve

      ################################################# # Sub get_nth_element_in_keyed_array # ################################################# ################################################# # PUBLIC Subroutine # ################################################# sub get_nth_element_in_keyed_value_array($$){ my $class = shift; local ($key,$place) = @_; my @List = @{$self->get_keyed_value_array($key)}; return $List[$place]; }
      This code is much better. In rare cases it will be nice to get a single element without the entire row.