in reply to shared complex scalars between threads

If you don't already have them, upgrade your threads and threads::shared modules to the latest CPAN versions. Recent versions have the ability to share objects:

#! perl -slw
use strict;
use threads;
use threads::shared;
use Junk;

print $threads::VERSION;
print $threads::shared::VERSION;

sub thread {
    my $obj = shift;
    my $tid = threads->self->tid;
    sleep $tid;
    $obj->add( $tid, time() );
    $obj->dump;
    return;
}

my $obj = Junk->new( abc => 123, pqr => 456 );
my $shrObj :shared = shared_clone( $obj );

my @threads = map threads->create( \&thread, $shrObj ), 1 .. 10;

sleep 11;
$shrObj->dump;

$_->join for @threads;

__END__
c:\test>junk5
1.71
1.26
bless({
  # tied threads::shared::tie
  1 => 1224113391,
  abc => 123,
  pqr => 456,
}, "Junk")
bless({
  # tied threads::shared::tie
  1 => 1224113391,
  2 => 1224113392,
  abc => 123,
  pqr => 456,
}, "Junk")
...

Junk.pm

package Junk;

use Data::Dump qw[ pp ];

sub new {
    my $class = shift;
    return bless { @_ }, $class;
}

sub dump {
    my $self = shift;
    pp $self;
}

sub add{
    my( $self, $key, $value ) = @_;
    $self->{ $key } = $value;
    return;
}

1;

I haven't done much with this ability, and I don't know how effective it is at sharing more complex objects, but it is worth a try. From a quick scan of the POD and the inside of the module I don't get how it works, so I can't even make a prediction.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
"Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."

Re^2: shared complex scalars between threads
by gone2015 (Deacon) on Oct 16, 2008 at 00:17 UTC

    I had a poke at shared objects a little while ago.

    You can share an arbitrarily complicated structure, with refs to arrays, hashes and scalars, to any depth you like. The tricky bit, I found, was that with refs to shared arrays and hashes you have to create an empty anonymous array/hash, mark it shared and then populate it.
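    For what it's worth, here is a minimal sketch of that "create empty, mark shared, then populate" dance. The sub name build_shared_config and the keys are made up for illustration; &share( {} ) and &share( [] ) are the documented threads::shared idiom for sharing an anonymous container (the leading & bypasses the prototype).

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Hypothetical helper: builds a nested shared structure by hand. Each inner
# container is created empty, marked shared, and only then populated.
sub build_shared_config {
    my %config :shared;

    my $limits = &share( {} );      # empty anonymous hash, now shared
    $limits->{max} = 10;
    $limits->{min} = 1;

    my $names = &share( [] );       # empty anonymous array, now shared
    push @$names, qw( alpha beta );

    $config{limits} = $limits;      # shared refs may be stored in a shared hash
    $config{names}  = $names;
    return \%config;
}

my $cfg = build_shared_config();

# A child thread modifies the structure; the parent sees the change after join.
threads->create( sub {
    $cfg->{limits}{max} = 99;
    push @{ $cfg->{names} }, 'gamma';
} )->join;

print "max=$cfg->{limits}{max} names=@{ $cfg->{names} }\n";
```

    Populating first and sharing afterwards does not work, because share() empties any array or hash it is applied to.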

    With objects the problem was that this meant that the object maker had to know to construct a shared object.

    As you say, the late model threads::shared claims:

    shared_clone REF

    shared_clone takes a reference, and returns a shared version of its argument, preforming (sic) a deep copy ...

    so an object can now be made shared after the event. But if the class adds new hashes or arrays to the object, without knowing they too need to be made shared, "fun" will ensue -- sure as eggs is eggs and this is not a pipe (also, no spoon). Not to mention the small issue of managing shared access to the object components.
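    To make the "fun" concrete, here is a rough sketch. The Counter class and its methods are hypothetical, invented purely for illustration. As far as I know, storing a plain hash ref into a shared hash dies with an "Invalid value for shared scalar" error, so a method that isn't share-aware breaks once the object has been through shared_clone. The lock calls show one simple way of managing access to the components.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Hypothetical class, for illustration only.
package Counter;

sub new { my $class = shift; return bless { count => 0 }, $class }

# Naive method: stores a plain (unshared) hash ref in the object.
sub add_meta_naive { my $self = shift; $self->{meta} = { @_ }; return }

# Share-aware method: copies the new data into shared memory before storing it.
sub add_meta_shared {
    my $self = shift;
    lock( %$self );
    $self->{meta} = threads::shared::shared_clone( { @_ } );
    return;
}

# Locking the object's hash serialises the read-modify-write on count.
sub increment { my $self = shift; lock( %$self ); $self->{count}++; return }

package main;

my $obj = shared_clone( Counter->new );

# The naive method is expected to die on a shared object.
my $ok = eval { $obj->add_meta_naive( colour => 'red' ); 1 };
print "naive add failed: $@" unless $ok;

$obj->add_meta_shared( colour => 'red' );

my @workers = map { threads->create( sub { $obj->increment for 1 .. 1000 } ) } 1 .. 4;
$_->join for @workers;

print "count=$obj->{count} colour=$obj->{meta}{colour}\n";
```

    Note that both working methods only make sense on an object that is already shared; lock() on an unshared hash is an error, which is exactly the "object maker has to know" problem again.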

    I'd be interested to hear how that goes !

      I'd be interested to hear how that goes !

      Me too. I have (and have expressed here before) misgivings about this feature, but it is too late in the day my time, and I am too wrapped up in my own code right now to fathom enough about the OP's problem and requirements to be able to construct a meaningful test. He is going to have to do that for himself.


Re^2: shared complex scalars between threads
by Anonymous Monk on Oct 16, 2008 at 15:24 UTC
    Thanks, I'll try that and let you guys know if it worked. If this works then I think it's preferable to having the complex object run in a separate process and exposed via an API since this would in effect serialize the access to it when several threads make a call to the API and request access to that object. I am trying to increase concurrency as much as possible without creating multiple copies of that large Bayesian model. Let's hope shared_clone works...
      I am trying to increase concurrency as much as possible without creating multiple copies of that large Bayesian model.

      Unfortunately, I don't think it will. I got to do a little more experimenting this morning and it seems that creating a shared object this way (using threads::shared::bless()) simply replicates everything it contains for each thread that gets a handle ;( Sorry if I've wasted your time on this. I thought that they were doing something clever, but it seems that is not the case. I cannot believe anyone thought this was a useful idea.

      The next possibility is to create the big object in a single thread and then have your other threads use it client-server fashion, passing messages detailing the request, and waiting for the reply. This could be done through a queue or individual shared scalars (or even sockets), but whichever way, the requests would effectively be serialised. If the requests are fairly long running and you are hoping to run multiple concurrent requests on different cpus, it isn't going to happen.
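      A minimal sketch of that client-server arrangement, using Thread::Queue for the messages. Everything here is assumed for illustration: the %model hash is a tiny stand-in for the big object, ask() is a made-up helper, and each client gets its own reply queue. It also assumes a Thread::Queue recent enough to enqueue array refs.

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Queue;

my $n_clients = 3;
my $requests  = Thread::Queue->new;
my @replies   = map { Thread::Queue->new } 1 .. $n_clients;   # one per client

# Server thread: the only thread that ever holds the big object.
my $server = threads->create( sub {
    my %model = ( spam => 0.9, ham => 0.1 );    # stand-in for the large model
    while ( defined( my $req = $requests->dequeue ) ) {
        my ( $client, $word ) = @$req;
        my $answer = exists $model{$word} ? $model{$word} : 0;
        $replies[$client]->enqueue( $answer );
    }
} );

# Hypothetical client-side helper: send a request, block until the reply arrives.
sub ask {
    my ( $client, $word ) = @_;
    $requests->enqueue( [ $client, $word ] );
    return $replies[$client]->dequeue;
}

my @clients = map {
    my $id = $_;
    threads->create( sub { return ask( $id, $id % 2 ? 'spam' : 'ham' ) } );
} 0 .. $n_clients - 1;

my @answers = map { $_->join } @clients;

$requests->enqueue( undef );    # undef tells the server loop to stop
$server->join;

print "@answers\n";
```

      The single server loop is where the serialisation happens: however many clients there are, only one request is being answered at any moment.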

      I do not currently have a solution to offer you. One possibility involves modifying the internals of Algorithm::NaiveBayes to separate the bulk of the data (currently stored as a complex attribute of the object itself) from the rest of the object's internals, and then attempting to have multiple instances of the object share a single shared copy of that bulk data without cloning. I have a few ideas on this, but nothing that is ready for prime time.

      Once again, sorry for any time you have wasted through the bum steer. I should have gone with my gut and stuck with my original misgivings about this feature.

