punkish has asked for the wisdom of the Perl Monks concerning the following question:

Update: My mental block has cleared, and I have figured out most of my questions below, except for the part about initializing 'bar'.
Apologies in advance if this turns out to be too basic a question, but it has got me all stymied.

This is how I want my design to be -- the object constructor gets a couple of input values, and creates a fully populated object. There are three things to consider:

  1. millions of objects are created, albeit not concurrently, during a program run
  2. some aspects of initialization of all the objects, all millions of them, are controlled by values in a set of init config settings
  3. a group of params in the objects is common to groups of objects, and since this group holds a lot of values, it should be queried only once per group

Example pseudocode:

    # In my long-running program
    use DBI;

    my $dbname = get_db_name_from_some_config_file();
    my $dbh = DBI->connect("dbi:SQLite:dbname=$dbname", "", "");

    my $user_id = 1;
    my $cfg = get_user_specific_config_values_stored_in_db($user_id);

    for my $obj_id (0 .. 2_000_000) {
        my $obj = Obj->new(dbh => $dbh, cfg => $cfg, obj_id => $obj_id);
        $obj->very_long_complicated_program();
    }

    # In a nearby module Obj.pm
    package Obj;

    sub new {
        my ($class, %args) = @_;

        my $dbh    = $args{dbh};
        my $cfg    = $args{cfg};
        my $obj_id = $args{obj_id};

        my $in = {
            dbh => $dbh,
            cfg => $cfg,
            foo => {},
            bar => {},
        };

        my $self = bless $in, $class;

        # foo is a result of a different query for every obj_id
        $self->{foo} = $self->foo($obj_id);

        # The following part is the only part I am still confused about.
        # Should I use something like Memoize here?
        #
        # bar changes only once per, say, every 5000 objects,
        # and the resulting query from $self->bar is a very
        # large set (thousands of rows), so it makes sense to
        # query the db only when bar changes. So, for obj_id
        # 0 through 4999, bar remains the same, then for obj_id
        # 5000 through 9999 a new bar is returned, and so on
        $self->{bar} = $self->bar($obj_id);

        return $self;
    }

    sub dbh {
        my $self = shift;
        return $self->{dbh};
    }

    sub cfg {
        my $self = shift;
        return $self->{cfg};
    }

    sub foo {
        my ($self, $obj_id) = @_;

        my $cfg = $self->cfg;
        my $dbh = $self->dbh;

        my $foo = query_dbh_for_foo_execute($obj_id);
        $foo = change_parts_of_foo_based_on($cfg);

        return $foo;
    }

    sub bar {
        my ($self, $obj_id) = @_;

        my $cfg = $self->cfg;
        my $dbh = $self->dbh;

        my $bar = query_dbh_for_bar_execute($obj_id);
        $bar = change_parts_of_bar_based_on($cfg);

        return $bar;
    }

How do I go about implementing the above?

Update2: Another, possibly better way to ask my question -- how do I create an object a part of which (its data) is shared with other objects without requiring making a copy of that data?

--

when small people start casting long shadows, it is time to go to bed

Re: initializing objects from values from a db
by CountZero (Bishop) on Dec 05, 2009 at 20:19 UTC
    Another, possibly better way to ask my question -- how do I create an object a part of which (its data) is shared with other objects without requiring making a copy of that data?
    Save the recurring data in an (anonymous) variable and store a reference to that variable in your object. But don't forget to dereference the reference if you need to access the data!
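
A minimal sketch of the idea above, assuming plain hash-based objects. The names `$shared_bar`, `$obj_a` and `$obj_b` are illustrative and not from the thread; the small array stands in for the large "bar" result set.

```perl
use strict;
use warnings;

# Hypothetical shared data; in the thread this would be the large "bar" result.
my $shared_bar = { rows => [ 1 .. 5 ] };

# Two hash-based objects store the same reference, so no copy of the data is made.
my $obj_a = { obj_id => 0, bar => $shared_bar };
my $obj_b = { obj_id => 1, bar => $shared_bar };

# Dereference to reach the data.
my $first_row = $obj_a->{bar}{rows}[0];
my $row_count = scalar @{ $obj_b->{bar}{rows} };

# A change made through one object is visible through the other.
push @{ $obj_a->{bar}{rows} }, 6;

# Comparing references numerically compares their addresses.
my $same = $obj_a->{bar} == $obj_b->{bar};
```

Because both objects point at one structure, any code that modifies the shared data changes it for every object holding the reference. If that is not wanted, the shared part should be treated as read-only.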

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

      Q1. What is an "anonymous" variable?

      Q2. So, I would query the db and create this anonymous variable in my script outside Obj.pm module, before the object is created, then pass a reference to it when creating the object? Then, I would keep track of when a new object would require a new version of this information (after 5000 objects with the current info in "cache" have been created and used), and then refresh this anonymous variable from new info from the db? Is that what you are suggesting?

      Update: Following up on my Q2 above, the approach described there is the only one I can think of. However, I would prefer to tuck everything inside my Obj.pm, so that the anonymous variable is created inside the class, and perhaps a class method keeps track of how it is used and allocated. Wouldn't that be cleaner and more object-oriented, so to speak?
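
One possible way to keep the bookkeeping inside the module is the core Memoize module with a NORMALIZER, which maps every id in a group to the same cache key. This is only a sketch: `load_bar` and `bar_group` are hypothetical names, the block size of 5000 is taken from the example in the question, and the hash returned by `load_bar` stands in for the real query result.

```perl
use strict;
use warnings;
use Memoize;

my $queries = 0;    # demonstration counter only

# Hypothetical stand-in for the expensive "bar" query.
sub load_bar {
    my ($obj_id) = @_;
    $queries++;
    return { first_id => $obj_id };
}

# Assumption: ids are grouped in blocks of 5000, as in the question's example.
sub bar_group {
    my ($obj_id) = @_;
    return int($obj_id / 5000);
}

# Calls whose normalized key matches share one cached result.
memoize('load_bar', NORMALIZER => 'bar_group');

my $bar_0    = load_bar(0);
my $bar_4999 = load_bar(4999);    # same group, served from the cache
my $bar_5000 = load_bar(5000);    # new group, runs the query again
```

Memoize keeps every result it has seen, so over a long run the cache would hold one large result set per group. `Memoize::flush_cache('load_bar')` empties it if memory becomes a concern.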

Re: initializing objects from values from a db
by punkish (Priest) on Dec 06, 2009 at 04:29 UTC
    Here is what I did, benchmarked, and it works.

        package Obj;

        # Class-level cache shared by all Obj instances
        my %bar_cache;

        sub new {
            my ($class, %args) = @_;

            my $dbh    = $args{dbh};
            my $cfg    = $args{cfg};
            my $obj_id = $args{obj_id};

            # foo and bar start out undefined so that the check in bar()
            # can tell whether this object has been populated yet
            my $in = {
                dbh => $dbh,
                cfg => $cfg,
                foo => undef,
                bar => undef,
            };

            my $self = bless $in, $class;

            # foo is a result of a different query for every obj_id
            $self->{foo} = $self->foo($obj_id);

            # bar changes only once per, say, every 5000 objects,
            # and the resulting query from $self->bar is a very
            # large set (thousands of rows), so it makes sense to
            # query the db only when bar changes. So, for obj_id
            # 0 through 4999, bar remains the same, then for obj_id
            # 5000 through 9999 a new bar is returned, and so on
            $self->{bar} = $self->bar($obj_id);

            return $self;
        }

        # dbh, cfg and foo are as in the original post

        sub bar {
            my ($self, $obj_id) = @_;

            # Already populated for this object
            return $self->{bar} if $self->{bar};

            unless (exists $bar_cache{$obj_id}) {
                my $dbh = $self->dbh;
                my $bar = query_dbh_for_bar_execute($obj_id);

                # Empty bar_cache of old values
                %bar_cache = ();
                $bar_cache{$obj_id} = $bar;
            }

            $self->{bar} = $bar_cache{$obj_id};
            return $self->{bar};
        }

    Since there is more than one way to do the above, are there any gotchas that I should be aware of?
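
Two things may be worth checking in a cache like this. First, an empty hash reference is true in Perl, so a placeholder such as `{}` will pass a truthiness test. Second, a cache keyed on the individual object id can only be hit by that same id, whereas a key derived from the group would be shared by every object in that group. The sketch below assumes groups of 5000 consecutive ids, as in the question. `_load_bar`, `_bar_for`, `$GROUP_SIZE` and `$QUERIES` are hypothetical names, and `_load_bar` stands in for the real database query.

```perl
use strict;
use warnings;

package Obj;

my $GROUP_SIZE = 5000;    # assumption: bar changes every 5000 ids
my ($cached_group, $cached_bar);
our $QUERIES = 0;         # demonstration counter only

# Hypothetical stand-in for the real database query.
sub _load_bar {
    my ($group) = @_;
    $QUERIES++;
    return { group => $group, rows => [ map { $group * 10 + $_ } 1 .. 3 ] };
}

sub new {
    my ($class, %args) = @_;
    my $self = bless { obj_id => $args{obj_id} }, $class;
    $self->{bar} = $class->_bar_for($args{obj_id});
    return $self;
}

# Class method: returns the shared bar for the group this id belongs to,
# reloading only when the group changes.
sub _bar_for {
    my ($class, $obj_id) = @_;
    my $group = int($obj_id / $GROUP_SIZE);
    unless (defined $cached_group && $cached_group == $group) {
        $cached_bar   = _load_bar($group);
        $cached_group = $group;
    }
    return $cached_bar;
}

sub bar { return $_[0]{bar} }

package main;

my $first  = Obj->new(obj_id => 0);
my $second = Obj->new(obj_id => 4999);    # same group, same reference
my $third  = Obj->new(obj_id => 5000);    # new group, triggers a reload
```

Objects created before a reload keep their reference to the previous data, so the old result set is freed only once the last object holding it goes away. A single-slot cache like this also assumes ids arrive roughly in order, since jumping back and forth between groups would reload on every switch.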
