in reply to Re^6: A faster?, safer, user transparent, shared variable "locking" mechanism.
in thread A faster?, safer, user transparent, shared variable "locking" mechanism.

If you have to acquire multiple concurrent locks(*), always acquire the locks in the same order

Yes, yes, yes; but you don't have that option if the language does the locking automatically for you, and you may not even be aware that it is locking at all.

From the first post in this thread, I got the impression that you want Perl to automatically lock a variable whenever it's being changed. Now, you can't simultaneously take this out of the hands of the programmer and make her responsible for the order in which locking occurs.
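A minimal iThreads sketch of the "same order" rule (all names here are hypothetical, not from the thread): every thread acquires $lockA before $lockB, so no thread can hold one lock while waiting on the other in the reverse order, and no cycle of waiters can form.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $lockA : shared;
my $lockB : shared;
my $count : shared = 0;

sub update {
    # Every thread takes $lockA first, then $lockB. Because the
    # acquisition order is globally consistent, deadlock is impossible.
    lock $lockA;
    lock $lockB;
    ++$count;
}

$_->join for map { threads->create( \&update ) } 1 .. 4;
print "count = $count\n";    # count = 4
```

Swap the two lock statements in just one of the threads and you have the classic deadlock recipe the parent post warns about.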


Re^8: A faster?, safer, user transparent, shared variable "locking" mechanism.
by BrowserUk (Patriarch) on Oct 27, 2006 at 08:39 UTC
    I got the impression that you want Perl to automatically lock a variable whenever it's being changed.

    I did, but I wasn't using traditional locks (as in semaphores or mutexes or spinlocks or any such similar beast), I was using Critical Sections. The idea was that whenever a reference to shared data was detected (by an exception raised due to access restrictions on the memory from which the shared data is allocated), the code tree would be back-tracked to the previous mutating parent opcode, a CS would be entered, and the subtree redispatched.

    Under this scheme, any second or subsequent shared-data access would be done within the original (and only) critsec, and so would also be protected. That is to say, if all access to all shared data is serialised through the one Critical Section, then there is no "second lock" (or third, or fourth) to try to acquire, so no deadlock can occur.
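The single-critsec idea can be emulated in user code with one global iThreads lock standing in for the Critical Section (a sketch with hypothetical names, not the opcode-level mechanism described above): every touch of shared data goes through the one lock, so there is never a second lock to wait on.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $critsec : shared;    # the one and only "lock"
my %data    : shared;

sub mutate {
    my ( $key, $val ) = @_;
    lock $critsec;          # enter the single Critical Section
    $data{$key} = $val;     # any further shared access inside here
                            # is already covered by $critsec
}

$_->join for map { threads->create( \&mutate, "k$_", $_ ) } 1 .. 3;
print scalar( keys %data ), "\n";    # 3
```

Coarse-grained, certainly, but with only one lock in the whole program a deadlock cycle cannot exist.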

    That may sound prohibitively conservative, but it is analogous to the existing situation with iThreads. There was also (in my mind) the idea that the programmer could take some control by allocating shared variables to one or more "pools", declaratively:

    my $var1 : shared('a') = 3;
    my $var2 : shared('a');
    my $var3 : shared('b') = 10;

    Each pool would be serialised through a different CritSec, which would allow for finer-grained serialisation and thus less contention, with the caveat that it would be down to the programmer to ensure that no one 'opcode tree' attempts to use variables from two or more pools. That sounds horrendous (even to me:), but remember that the default is a single CritSec and no possibility of deadlocks--coarse-grained and potentially high-contention, but safe.

    Once you get your code working, you can then attempt to optimise it by reducing contention. You start by allocating your shared variables to two or more pools based upon their use within the program. I.e. two shared variables that are never both used in the same code path can be allocated to different pools.

    E.g. if you have shared vars that are only ever read by one thread and written by another, you can safely place them in a pool of their own. The same applies with multiple readers and a single writer.
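The pool idea can be emulated today with one iThreads lock per pool (the `: shared('a')` attribute syntax above is proposed, not real Perl; the names below are hypothetical): each code path only ever takes its own pool's lock, so no path can hold two of them.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $pool_a : shared;    # guards $var1 and $var2
my $pool_b : shared;    # guards $var3
my $var1   : shared;
my $var2   : shared;
my $var3   : shared;

sub writer_a {          # this code path only ever touches pool 'a'
    lock $pool_a;
    $var1 = 3;
    $var2 = $var1 + 1;
}

sub writer_b {          # this code path only ever touches pool 'b'
    lock $pool_b;
    $var3 = 10;
}

threads->create( \&writer_a )->join;
threads->create( \&writer_b )->join;
print "$var1 $var2 $var3\n";    # 3 4 10
```

The discipline the post describes is exactly the invariant commented above: a given opcode tree (here, a sub) never mixes variables from two pools, so at most one pool lock is ever held at a time.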

    This would work fine (I think) for shallow subtrees. But the idea runs out of steam (see the last para) when a mutation takes more than a few milliseconds, as you mustn't stay in a critical section for longer than that :(

    Beyond removing the need for explicit locking, a secondary benefit of this scheme (had it been viable) is the reduced impact of threading upon non-threaded code. As non-threaded code would never raise the exception, it would never pass through the shared-data code paths. That would have clawed back a considerable amount of the 5.6.x performance that was lost when threading support was added in 5.8.

    However, when applied to user-controlled multi-locking, the "always in the same order" maxim is the right thing to do.

    Further, many simple uses of threading--running slow processing (like sorts) in the background whilst GUIs/CUIs remain responsive; adding timeouts or the ability to kill separate processes, network connections, server reads etc.; and a myriad of other 'do this slow thing, but give me the ability to do something else in the meantime' tasks--require only minimal shared data to indicate that the process is finished, or to retrieve its results:

    my $done : shared = 0;
    my $results : shared;

    async {
        $results = someLongRunningFunction();
        ++$done;
    };

    # do other stuff ...

    if( $done ) {
        ## use $results;
    }

    And

    my $start = time();
    my $content : shared;

    async {
        $content = $lwp->get( $url );
    };

    sleep 1 while time() < $start + 3;

    unless( defined $content ) {
        ## shutdown the socket
    }

    ## use the $content.

    For these, the whole issue of multiple locks and deadlocking never arises. Barring the programmer from these simple uses because complicated ones are difficult to get right seems silly.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.