rdfield has asked for the wisdom of the Perl Monks concerning the following question:

Fellow Monks,
I'm hoping that someone will be able to cast an eye over my datastructure caching method detailed below.

To set the scene: I have a large database (several dozen million rows) and a web-based (Apache/mod_perl) application that allows users to define and select their own views of the data. Some operations take several minutes to create the data structure that underlies the view. Some changes to the view do not require the data structure to be rebuilt, just parsed in a different way to produce the user-selected view. For this case I thought that using one of the serialisation modules would help, and they did until the number of rows returned from the database exceeded about 20000. That would have been OK, but the result sets can involve several hundred thousand rows, and beyond 20000 rows it becomes quicker to rebuild the data structure from scratch.

After a bit of head scratching I tried to use SOAP::Lite for the job, but again became bogged down by the fact that the data of the object is serialised and sent to the client. Hence the kludge I have been working on. The following code actually works (i.e. the data is retained across different runs of the client process, simulating multiple web requests), but (and here's the question...) it all looks a bit, well, inelegant. Does any monk with more experience of object caching have any ideas or observations that they're willing to share?

The code:
The Demo package:

    #!perl -w
    package Demo;
    use strict;

    my @handle;    # cache of objects, indexed by id

    sub new {
        my $class = shift;
        my $id    = shift;
        $handle[$id] = bless {string => shift}, $class;
        print "new $handle[$id]->{string}";
        return $id;
    }

    sub string {
        my $self = shift;
        my $id   = shift;
        print "id = $id, handle string = $handle[$id]->{string}\n";
        if (@_) {
            $handle[$id]->{string} = shift;
        }
        return $handle[$id]->{string};
    }

    sub get {
        my $self = shift;
        my $id   = shift;
        # defined() rather than "== undef", which compares numerically
        if (!defined $handle[$id]) {
            print "not defined!";
            $self->new($id, "undefined");
        }
        return $id;
    }

    1;

The server:

    #!perl -w
    use SOAP::Transport::HTTP;
    use Demo;
    use strict;

    my $daemon = SOAP::Transport::HTTP::Daemon
        -> new (LocalPort => 90)
        -> dispatch_to('Demo')
    ;
    print "Contact to SOAP server at ", $daemon->url, "\n";
    $daemon->handle;

The first client:

    #!perl -w
    use strict;
    use SOAP::Lite +autodispatch =>
        uri   => 'Demo',
        proxy => ('http://localhost:90/soap/server.pl');

    my $demo = Demo->new(1,"test1a");
    print Demo->string(1) . "\n";
    Demo->string(1,"test1b");
    print Demo->string(1) . "\n";

and the second client that demonstrates that the data has been cached:

    #!perl -w
    use strict;
    use SOAP::Lite +autodispatch =>
        uri   => 'Demo',
        proxy => ('http://localhost:90/soap/server.pl');

    my $demo = Demo->get(1);
    print Demo->string(1) . "\n";
    Demo->string(1,"test 2");
    print Demo->string(1) . "\n";

rdfield

Replies are listed 'Best First'.
Re: Large datastructure caching
by robartes (Priest) on Nov 16, 2002 at 11:56 UTC
    Lemme see if I get this - you're caching your object handles in the @handle array, which contains references back to your blessed hash. This sounds almost like flyweight objects from the book. I don't think converting this to a flyweight object implementation will gain you much speed or anything, but it will gain you some elegance, as you no longer have to pass $id around to your methods.

    Basically, a flyweight object is a blessed scalar which contains an index into an array of references to something (sound familiar? :) ). So, instead of passing $id around, why don't you make $id your object by blessing it, and put references to the hashes containing {string} keys in an array, indexed by $id. Your constructor can then either take no argument, which creates a new entry in the array, or take an id to give you a handle on the object with that id.

    Something like this:

        use strict;
        package Demo;
        my @handles;

        sub new {
            my $thingy=shift;
            my $args=shift;
            my $id=$args->{'id'};
            # Note that $args->{'id'} is autovivified here, but
            # this is no problem as we're testing for definedness
            # further on
            my $class=ref($thingy)||$thingy;
            unless (defined $id) {$id=scalar @handles};
            my $self=bless $id, $class;
            $handles[$self]={string => $args->{'string'}};
            return $self;
        }
    Note that this code is untested, and that I used hash style argument passing (new({id=>11, string=>"camel"})) in the constructor to account for the optional presence of id.

    I also have no idea how this plays with SOAP, as I have no experience on that matter.

    Your accessor functions then do not have to have $id passed to them, they can just use $self. This will gain you a minor improvement in elegance, for what it's worth :)

    CU
    Robartes-

      After a bit of work (not too much, this is Perl after all) just a couple of small changes to your code were required to make it work:
      my $self=bless $id, $class;
      becomes
      my $self=bless \$id, $class;
      and all references to $self as indices to the array of handles become $$self. The calling code looks much more elegant and the internal calls are a bit better too.

      rdfield

      Update: I've simplified the internal calls a bit further:
      $self = $handles[${shift()}];
      and the rest of the code remains unchanged.
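For readers following along, here is a minimal sketch of the flyweight with both changes applied (`bless \$id` and the `${shift()}` lookup). It is an illustration rather than the code actually used in the application: the decision to reuse an existing entry when an id is passed in, and the sample values in `main`, are assumptions made for the example.

```perl
use strict;
use warnings;

package Demo;

my @handles;    # private store: index => hash holding the real data

sub new {
    my ($thingy, $args) = @_;
    my $class = ref($thingy) || $thingy;
    my $id    = $args->{id};
    $id = scalar @handles unless defined $id;    # no id given: take the next free slot

    # Assumption: an existing entry is reused rather than overwritten
    $handles[$id] ||= { string => $args->{string} };

    return bless \$id, $class;    # the object is a reference to the index
}

sub string {
    my $data = $handles[ ${ shift() } ];    # dereference the handle to get the index
    $data->{string} = shift if @_;
    return $data->{string};
}

package main;

my $first = Demo->new({ id => 3, string => 'camel' });
my $again = Demo->new({ id => 3 });    # second handle onto the same data
$again->string('llama');
print $first->string, "\n";            # prints "llama"
```

Both handles are tiny scalars; the data itself lives only in `@handles`, which is what keeps it cached for as long as the process runs.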

      Update 2: Blessed Scalar references are not handled very well in the serialisation of SOAP transactions, so I've had to kludge the code as follows:
      Constructor:

      my $self = bless {id => $id},$class;
      Method:
      my $self = $handles[shift()->{id}];
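Put together, the hash-handle version might look like the sketch below. Storable's `freeze`/`thaw` is used here only as a stand-in for the SOAP round trip (an assumption for the sake of a self-contained example; SOAP::Lite's own serialiser is not involved), to show that the handle can be copied by value while the data stays in the server process.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);    # stand-in for the SOAP round trip (assumption)

package Demo;

my @handles;    # the large data stays here, inside the long-running process

sub new {
    my ($class, $id, $string) = @_;
    $handles[$id] ||= { string => $string };
    return bless { id => $id }, $class;    # the handle carries only the id
}

sub string {
    my $self = $handles[ shift()->{id} ];  # swap the handle for the cached data
    $self->{string} = shift if @_;
    return $self->{string};
}

package main;

my $handle = Demo->new(1, 'test1a');
my $copy   = thaw(freeze($handle));        # a copy, as a client would receive it
$copy->string('test1b');
print $handle->string, "\n";               # prints "test1b"
```

Because the serialised handle contains nothing but `{id => 1}`, its size is independent of how many rows the cached structure holds.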
      A blessed scalar - but of course!

      Elegance achieved, cheers robartes++

      rdfield

Re: Large datastructure caching
by rdfield (Priest) on Nov 16, 2002 at 10:03 UTC
    Actually the code shown doesn't show the worst part of the inelegance, which happens when one of the cached object handles tries to invoke a "private" method:
    sub _internal_method {
        my ($class,$id,@params) = @_;
        my $self = \%{$handles[$id]};
        ...
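With the hash-style handle from the reply above, the "private" call no longer needs $id threaded through by hand. A rough sketch, in which `_data` and `describe` are hypothetical names introduced purely for illustration:

```perl
use strict;
use warnings;

package Demo;

my @handles;    # cached data, indexed by id

sub new {
    my ($class, $id, $string) = @_;
    $handles[$id] ||= { string => $string };
    return bless { id => $id }, $class;
}

# Hypothetical helper: turn the handle into the cached data it points at
sub _data {
    my $self = shift;
    return $handles[ $self->{id} ];
}

# "Private" method: receives the handle like any other method
sub _internal_method {
    my ($self, @params) = @_;
    my $data = $self->_data;
    return join ' ', $data->{string}, @params;
}

# Public method that calls the private one with normal method syntax
sub describe {
    my $self = shift;
    return $self->_internal_method('via', 'describe');
}

package main;

print Demo->new(2, 'cached')->describe, "\n";    # prints "cached via describe"
```

The lookup into `@handles` then lives in one place, so internal methods can call each other with ordinary `$self->method(...)` syntax.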

    rdfield