rsiedl has asked for the wisdom of the Perl Monks concerning the following question:

hi monks,

i'm trying to figure out a way to get some data from a database/table that can cope with the following parameters:
- can handle multiple table names
- can cope with the table structure changing on a regular basis
- outputs the data in a hash of hashes i.e. $kol{$id}{field} = $value

so far i have come up with the following and was wondering if anyone could suggest improvements or other ways to achieve this?

Cheers,
Reagen

Test Code
#!/usr/bin/perl
use strict;
use warnings;
use DBI;
use Benchmark qw( timediff timestr );

my $dbh = DB_OPEN('aim6_control','localhost','3306','root','********');
my $table = "aim6_project_1.4_stats";
my ($sth, $select, $td, $t0, $t1);
my %kols = ();

print "Start our tests!\n";

# Method 1
print "\n\nMethod 1\n--------\n";
$t0 = new Benchmark;
%kols = ();
$sth = $dbh->prepare(" desc $table ");
$sth->execute();
while ( my ( $NAME, $TYPE, $NULL, $KEY, $DEFAULT, $EXTRA ) = $sth->fetchrow ) {
    $select = $dbh->prepare(" select id, $NAME from $table ");
    $select->execute();
    while ( my ( $kol_id, $field_value ) = $select->fetchrow ) {
        $kols{$kol_id}{$NAME} = $field_value;
    } # end-while
    $select->finish;
} # end-while
$sth->finish;
$t1 = new Benchmark;
bm_time( timediff($t1, $t0) );
print_sample_results(%kols);

# Method 2
print "\n\nMethod 2\n--------\n";
$t0 = new Benchmark;
%kols = ();
my @fields;
$sth = $dbh->prepare(" desc $table ");
$sth->execute();
while ( my ( $NAME, $TYPE, $NULL, $KEY, $DEFAULT, $EXTRA ) = $sth->fetchrow ) {
    push(@fields, $NAME);
} # end-while
$sth->finish;

$select = $dbh->prepare(" select * from $table ");
$select->execute();
while ( my ( @values ) = $select->fetchrow_array ) {
    my $count = 0;
    foreach my $value (@values) {
        $kols{$values[0]}{$fields[$count]} = $value;
        $count++;
    } # end-foreach
} # end-while
$select->finish;
$t1 = new Benchmark;
bm_time( timediff($t1, $t0) );
print_sample_results(%kols);

exit;

sub DB_OPEN {
    my ($db_name, $host_name, $port, $db_user, $db_pass) = @_;
    my $database = "DBI:mysql:$db_name:$host_name:$port";
    my $dbh = DBI->connect($database, $db_user, $db_pass);
} # end-sub

sub bm_time {
    my ($bm_obj) = @_;
    print "Benchmark Time: ",
          sprintf( "%.3f", $bm_obj->[1] + $bm_obj->[2] ),
          " cpu seconds\n\n";
} # end-sub

sub print_sample_results {
    my (%kols) = @_;
    my $count = 0;
    print "Sample Results:\n";
    foreach my $kol_id (keys %kols) {
        print "\tKOL ID:: $kol_id\t\tKOL RANK:: $kols{$kol_id}{rank}\n";
        $count++;
        last if ($count == 6);
    } # end-foreach
} # end-sub
Test Results
[rsiedl@solitare aim6]$ perl test.pl
Start our tests!


Method 1
--------
Benchmark Time: 0.120 cpu seconds

Sample Results:
    KOL ID:: 32    KOL RANK:: 26
    KOL ID:: 33    KOL RANK:: 47
    KOL ID:: 21    KOL RANK:: 22
    KOL ID:: 7     KOL RANK:: 26
    KOL ID:: 26    KOL RANK:: 13
    KOL ID:: 2     KOL RANK:: 47


Method 2
--------
Benchmark Time: 0.020 cpu seconds

Sample Results:
    KOL ID:: 32    KOL RANK:: 26
    KOL ID:: 33    KOL RANK:: 47
    KOL ID:: 21    KOL RANK:: 22
    KOL ID:: 7     KOL RANK:: 26
    KOL ID:: 26    KOL RANK:: 13
    KOL ID:: 2     KOL RANK:: 47
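One tidying-up possibility for Method 2: the per-row counter can be replaced with a single hash-slice assignment that zips the field list against the row values, and DBI exposes the column names directly via $sth->{NAME} after execute(), which would save the separate desc query. A minimal, database-free sketch of the slice idiom (the field names and sample rows below are hypothetical stand-ins for what desc and fetchrow_array would return):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Field names as desc (or $sth->{NAME}) would return them,
# and sample rows standing in for $select->fetchrow_array.
my @fields = qw( id name rank );
my @rows   = ( [ 32, 'Smith', 26 ], [ 33, 'Jones', 47 ] );

my %kols;
for my $row (@rows) {
    my @values = @$row;
    # Hash slice: assign every field/value pair in one go,
    # keyed on the first column (id). Autovivification
    # creates the inner hashref for us.
    @{ $kols{ $values[0] } }{ @fields } = @values;
}

print "$_ => $kols{$_}{rank}\n" for sort { $a <=> $b } keys %kols;
```

The slice assignment does exactly what the counter loop does, but in one statement and without indexing bugs if the field list and row ever get out of step in the code.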

Replies are listed 'Best First'.
Re: Accessing data from a dynamic database/table
by Zaxo (Archbishop) on May 09, 2006 at 06:10 UTC

    Here's another, untested, approach. I'm going to assume that $id, your first-level key, is supposed to be the primary index of the table. A hash implies uniqueness of the keys, otherwise you'd lose data.

    The task is to produce a hash, %kol, which contains a table's primary index values as top-level keys, and a hash of each row's other field/value pairs at the second level.

    That second-level hash will almost fall in our lap if we use fetchrow_hashref() to read the table. There will be no need to hardcode or independently determine the column names.

    First, though, we need to find out what the primary key is. You've assumed it will be id, but let's generalize and make this thing work for any table with a primary index. I'll leave out the connection and all that other stuff, picking up at your # Method n comment.

    # Method Z
    my @key_columns = $dbh->primary_key( $catalog, $schema, $table );
    Instead of relying on a count while reading the table to limit the number, we'll tell the database to do it for us. Oops, misthunk, let's have no limits.
    # my $limit = 6;
    my $select = $dbh->prepare("select * from $table");# LIMIT $limit");
    $select->execute();
    Now we loop through fetchrow_hashref to get a col/val hash of each row. For each, we'll delete the primary key columns, welding together their values to make a top-level key for %kol (it may be a little unfamiliar to use delete that way, but it's a handy trick). What's left in the $row hashref is exactly what you want for the second-level hash, so we just plug that reference right in there.
    while (my $row = $select->fetchrow_hashref()) {
        my $key = join '', delete @{$row}{@key_columns};
        $kol{$key} = $row;
    }
    That's it! You may want to join the key elements using some unlikely character like "\0", just in case you want to extract them from the key later. We didn't need to call finish() on any handles because we didn't leave anything unread.
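    That composite-key round trip can be sketched without a database (the two-column key below is a hypothetical example, not from the table above):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A row as fetchrow_hashref might return it, with a
# hypothetical two-column primary key (year, id).
my @key_columns = qw( year id );
my $row = { year => 2006, id => 32, rank => 26 };

# delete on a hash slice removes the key columns from the
# row and returns their values, which we join with "\0".
my %kol;
my $key = join "\0", delete @{$row}{@key_columns};
$kol{$key} = $row;

# Because "\0" won't appear in the data, we can split the
# composite key back into its parts later.
my ($year, $id) = split /\0/, $key;
print "year=$year id=$id rank=$kol{$key}{rank}\n";
```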

    I have no idea how that will benchmark. It avoids a good bit of data copying, so it ought to be competitive.

    After Compline,
    Zaxo

      Thanks Zaxo,

      Plugged it in and got the following results:
      Method 3
      --------
      Benchmark Time: 0.010 cpu seconds

      Sample Results:
          KOL ID:: 32    KOL RANK:: 32
          KOL ID:: 33    KOL RANK:: 21
          KOL ID:: 21    KOL RANK:: 24
          KOL ID:: 7     KOL RANK:: 5
          KOL ID:: 26    KOL RANK:: 1
          KOL ID:: 2     KOL RANK:: 29
      Cheers,
      Reagen

        I see the rank data differs. Has the data changed since the run you showed?

        I'm glad if it works for you. Some of the introspective DBI methods like primary_key() are fairly recent additions to the module.

        After Compline,
        Zaxo