TOD has asked for the wisdom of the Perl Monks concerning the following question:

hi all,

i'm working on a daemon process whose purpose is to cache nested data structures and thus synchronize the data between several processes. the daemon communicates with its clients via tcp/ip connections, and it receives the data to be cached in serialized form (via Storable.pm). the process consists mainly of two threads (plus a logging one). the main thread listens on its socket, receives requests, performs all necessary actions, and sends the responses. the second thread periodically synchronizes the cached data with the respective files on disk. due to this construct all data have to be shared, e.g.:
    our %planets : shared;
    our $response = SomeClass::TCP_IP_Response->new(@some_arguments);
    [...]
    lock %planets;
    if (exists $planets{$some_id}) {
        lock $planets{$some_id};
        $response->content($planets{$some_id}->{'data'});
        $planets{$some_id}->{'atime'} = time;
    }
    else {
        my $data;
        [...] # read the serialized data from disk
        my %el = (
            'data'     => $data,
            'atime'    => time,
            'deleted'  => 0,
            'modified' => 0
        );
        $planets{$some_id} = share(%el);
    }
what is confusing me here is that the manpage for threads::shared says: "share will traverse up references exactly one level. share(\$a) is equivalent to share($a), while share(\\$a) is not. This means that you must create nested shared data structures by first creating individual shared leaf nodes, then adding them to a shared hash or array." does this mean that in my example the $data value from %el has to be explicitly shared as well, or will it become shared automatically, since it's a scalar value and part of %el?

maybe a silly question, but i'm really stuck at the moment. many thanks in advance.
--------------------------------
masses are the opiate for religion.

Re: a question on sharing data structures across threads
by BrowserUk (Patriarch) on Oct 08, 2007 at 06:26 UTC
    does this mean that in my example the $data value from %el has to be explicitly shared as well, or will it become shared automatically, since it's a scalar value and part of %el?

    Whenever you use threads::shared::share() on a hash or array, any previous contents of that hash or array are silently discarded. It's a PITA but "working as designed". That is, it is operating the way the originators intended share() to work.

    To achieve your aim, it is necessary to explicitly share every compound data structure (array or hash) prior to populating it.
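A minimal sketch of the "share first, populate afterwards" order, assuming a perl built with thread support. The names %entry, %plain and the keys earth and mars are made up for illustration. It also shows what happens when an unshared reference is stored in a shared hash: the assignment dies instead of sharing the data silently.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my %planets : shared;

# Share the (still empty) inner hash first, then fill it.
my %entry;
share(%entry);
%entry = ( data => 'payload', atime => time );
$planets{earth} = \%entry;    # fine: %entry is already shared

# A reference to an unshared hash is rejected at assignment time.
my %plain = ( data => 'payload' );
my $ok    = eval { $planets{mars} = \%plain; 1 };
my $err   = $ok ? '' : $@;

print "stored earth: $planets{earth}{data}\n";
print "mars rejected: $err" if $err;
```

Because share() runs while %entry is still empty, there is nothing for it to discard.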

    One way of achieving this is to use an existing nested structure traversal utility (eg. Data::Rmap), and copy the data at each level into an appropriate shared structure before assigning a reference to the shared copy to the shared parent.

    It is slow and gets messy to do this yourself. I've had a couple of goes at doing this in the past, but it is really something that should be done once, internally, rather than having to be recreated by each programmer.
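As a rough sketch of the level-by-level copy described above: share_deep() below is a hypothetical helper, not part of threads::shared. It assumes the input contains only plain scalars, hashes and arrays, with no cycles, objects or code references. If your threads::shared is recent enough to export shared_clone(), that is probably the better choice, since it does this kind of copy internally.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Hypothetical helper: return a shared deep copy of plain nested data.
sub share_deep {
    my ($item) = @_;
    my $type = ref $item;
    return $item unless $type;    # plain value: copied as-is

    if ( $type eq 'HASH' ) {
        my $copy = &share( {} );    # '&' bypasses share()'s prototype
        $copy->{$_} = share_deep( $item->{$_} ) for keys %$item;
        return $copy;
    }
    if ( $type eq 'ARRAY' ) {
        my $copy = &share( [] );
        push @$copy, share_deep($_) for @$item;
        return $copy;
    }
    die "share_deep: cannot copy a $type reference\n";
}

my $plain  = { name => 'earth', moons => ['luna'], orbit => { au => 1 } };
my $shared = share_deep($plain);
print "moons: @{ $shared->{moons} }, au: $shared->{orbit}{au}\n";
```

Each container is shared while it is still empty and only then filled, so nothing is discarded along the way. The original $plain structure is left untouched.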


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      ok, thank you for that information. another question: if i declare my $data : shared within an if { } block, does the shared value go out of scope after the block? will i have to explicitly call something like $planets{$some_id}->{'data'} = share($data); to keep the value?
        if i declare my $data : shared within an if { } block, does the shared value go out of scope after the block?

        Yes, it will go out of scope--but it will not cease to exist if

        1. anything holds a reference to it.
        2. it is a reference and you have assigned it to something whose scope is wider.

        That's not a good explanation. The problem is that it depends upon what $data is (contains).

        • If $data is a simple scalar value, then there is no need to share() it in order to assign it to a shared data structure.

          This is because assigning a simple scalar just copies its value into the (shared) destination.

        • However, if $data is a reference to a hash or array (or nested structure of either or both), then simply sharing that reference is not enough.

          This is because (as indicated above) when you share a reference to a compound data structure (hash or array), it will (silently) empty the referenced structure!

          For example:

          use threads;
          use threads::shared;
          use Data::Dumper;

          ## Non-shared hash assigned some data:
          my %d = ( 1 .. 10 );
          print Dumper \%d;
          ## $VAR1 = { '1' => 2, '3' => 4, '7' => 8, '9' => 10, '5' => 6 };

          ## Share a reference to that hash and assign it to a shared scalar
          my $r:shared = share( %d );

          ## And not only will the reference point to an empty hash
          print Dumper $r;
          ## $VAR1 = {};

          ## But also the contents of the original hash will have been silently discarded
          print Dumper \%d;
          ## $VAR1 = {};

        So if, as suggested by your OP code, $data contains a reference (as returned by Storable::retrieve()), then you will need to copy the contents of the referenced thing into a shared equivalent.

        To illustrate (Please read the comments carefully!):

        our %planets : shared;
        our $response = SomeClass::TCP_IP_Response->new(@some_arguments);
        [...]
        lock %planets;
        if (exists $planets{$some_id}) {
            lock $planets{$some_id};
            $response->content($planets{$some_id}->{'data'});
            $planets{$some_id}->{'atime'} = time;
        }
        else {
            my $data;
            [...] # read the serialized data from disk

            ## Assuming $data is a reference to a *simple*, *single-level* hash
            ## Copy the data into a shared equivalent;
            my %sharedHash :shared = %{ $data };

            ## And then assign a reference to the *shared* copy.
            # my %el = ( 'data'     => \%sharedHash,
            #            'atime'    => time,
            #            'deleted'  => 0,
            #            'modified' => 0
            # );
            ## But if you do this, *ALL THE CONTENTS OF %el WILL BE DISCARDED*
            # $planets{$some_id} = share(%el);

            ## So make the local data structure shared also
            my %el :shared = (
                'data'     => \%sharedHash,
                'atime'    => time,
                'deleted'  => 0,
                'modified' => 0
            );

            ## Now you do not need to use share()
            ## as a reference to a shared object is shared.
            $planets{$some_id} = \%el;
        }

        And once you've placed a reference to a local (lexical) variable (shared or not) into a variable whose scope is wider, the contents of that variable (but not the name) will persist (not be GC'd) as long as the reference persists.
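A small sketch of both points, with made-up names (%cache, %entry, $label) and assuming a threaded perl: a plain scalar is copied by value into the shared hash, and a shared lexical declared inside a block stays alive after the block ends because a wider-scoped hash holds a reference to it.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my %cache : shared;

{
    my $label = 'plain scalar';    # not shared, and need not be
    my %entry : shared = ( atime => 42 );

    $entry{label} = $label;        # copies the value, not the variable
    $cache{earth} = \%entry;       # this reference keeps the hash alive

    $label = 'changed later';      # does not affect the stored copy
}

# The names $label and %entry are gone here; the data is still reachable,
# including from another thread.
my $thr  = threads->create( sub { "$cache{earth}{label}/$cache{earth}{atime}" } );
my $seen = $thr->join;
print "$seen\n";
```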

        But also note that if $data is a reference to a hash or array that contains nested references to other hashes or arrays, then you will also need to copy these nested structures manually. That is where a recursive structure traversal tool is needed.


Re: a question on sharing data structures across threads
by TOD (Friar) on Oct 09, 2007 at 12:08 UTC
    apparently we are digging deeper and deeper into the bouncing heart of perl itself with our discussion. i've just tested your workaround, and it worked quite fine - until:
    foreach (keys %planets) {
        lock %{$planets{$_}};
        [...]
        if ($planets{$_}->{'atime'} < $time - $some_limit) {
            delete $planets{$_}; # remove the item from cache
        }
    }
    obviously this cannot work, since we're holding a lock on the variable at the very moment we try to delete it, and consequently the result is: panic: MUTEX_LOCK (22) ( shared.xs:90). and as i recall having read somewhere that lock() isn't atomic, it would not be a thread-safe solution to put the delete() outside of the lock block either. it seems to me that locking the whole hash/array is the only realistic possibility for the moment. but maybe some fine day in the future one of the perl developers happens to stumble on this thread... ^^
      locking the whole hash/ array is the only realistic possibility for the moment. but maybe some fine day in future one of the perl developers happens to stumble on this thread.

      Actually, if you think about it, locking the whole hash is the only logical thing to do when modifying the hash, rather than updating things pointed at by its values.

      Grr. Once again, that is about as clear as mud.

      There are two types of changes that can be made to a hash:

      1. Those that modify the contents of substructures pointed to from the top-level hash:

        Modifying (via reference) either the contents of the %el hash or $scalar_data in your previous example fits into this category.

        For these, locking the sub-hash or scalar respectively is enough as the structure of the top-level hash does not change.

      2. Those that modify the structure of the hash itself:

        Adding a new key/value pair, or deleting an existing key/value pair, and even changing the value of an existing key/value pair fit in this category.

        For these operations, it is essential that the top-level hash be locked as they change the structure of that hash itself.

        For example, adding a new key/value pair could cause the entire hash to be expanded, which involves doubling the number of buckets and re-hashing every key. Having other threads attempt to do anything with the top-level hash whilst that type of operation is in progress is an obvious no-no.

        Equally, deleting a pair will affect the operation of the each/keys/values iterators and could result in strange results unless the entire hash is locked.

        The possible consequences of modifying an individual pair value are more subtle, but if you think it through, you can see the window of opportunity for errors.
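The two categories above might be sketched like this. The subroutine names touch_entry, add_entry and expire_entries are invented for the example, and the entry layout (just an atime field) is a simplification of the cache discussed in this thread. The expiry pass takes only the top-level lock and never locks the entry it is about to delete, which sidesteps the situation that produced the panic.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my %planets : shared;

# Category 1: change inside one entry; the entry's own lock is enough.
sub touch_entry {
    my ($id) = @_;
    my $entry;
    {
        lock(%planets);    # brief lock, only to look the entry up
        $entry = $planets{$id};
    }
    return 0 unless $entry;
    lock(%$entry);
    $entry->{atime} = time;
    return 1;
}

# Category 2: add or delete keys; hold the top-level lock throughout.
sub add_entry {
    my ( $id, $atime ) = @_;
    my %entry : shared = ( atime => $atime );
    lock(%planets);
    $planets{$id} = \%entry;
    return;
}

sub expire_entries {
    my ($max_age) = @_;
    my $cutoff = time - $max_age;
    lock(%planets);
    my @stale = grep { $planets{$_}{atime} < $cutoff } keys %planets;
    delete $planets{$_} for @stale;
    return scalar @stale;
}

add_entry( fresh => time );
add_entry( stale => time - 3600 );
my $removed = expire_entries(60);
print "removed $removed, kept: @{[ sort keys %planets ]}\n";
```

The list of stale keys is collected before anything is deleted, so the loop never iterates over a hash that is changing underneath it.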

