http://qs1969.pair.com?node_id=580385


in reply to Re^2: Parrot, threads & fears for the future.
in thread Parrot, threads & fears for the future.

I haven't really had a good look at Perl6 yet, so I don't want to make any comments as to how well it addresses threading.

I was commenting that it's not the threads-forks-clusters that makes the difference; it's being able to hint to the compiler that some code *could* be parallelised, and having a language that provides some high-level constructs for when I want to force something into a separate thread.

Programming thread pools and synchronisers and the like feels very similar to programming low-level functions in C - it focusses on what I want the computer to do, rather than telling it what I want to get. So I'm definitely in the "future is parallel, but probably not threaded" camp.

___________________
Jeremy
I didn't believe in evil until I dated it.


Re^4: Parrot, threads & fears for the future.
by BrowserUk (Patriarch) on Oct 25, 2006 at 11:41 UTC
    Programming thread pools and synchronisers and the like feels very similar to programming low-level functions in C - it focusses on what I want the computer to do, rather than telling it what I want to get.

    I utterly, totally, completely agree with that statement. Which is why I am so pleased to see that Perl6 has recognised the issues, has them covered, and already working!

    For (one possible flavour of) the underpinnings, that is not (yet) the case.

    So I'm definitely in the "future is parallel, but probably not threaded" camp.

    There is a whole class of programs - somewhat characterised by the algorithms for which entire classes of specialist processor units (vector processors) have been built - that can benefit enormously from the multiple CPUs/cores that are now becoming commonplace, but that require (or at least benefit enormously from) shared state, and cannot be performed as easily or as efficiently using processes (or clusters) as they can using threads.

    Whilst these algorithms are often described as being for scientific work, and more recently for graphical work (games & audio, for which graphics and sound cards carry specialist GPUs & DSPs), those same and similar algorithms can also be used for much more mundane and everyday computing tasks once you have the processing power to utilise them.

    Three (recent, real-world) examples that have turned up or been referenced here at PM:

    1. Bioinformatics. Forking Multiple Regex's on a Single String
    2. Data mining Benign Web Miner.
    3. Statistical analysis for a web-based business. Want a million dollars?

    All three can be tackled using multiple processes (and, by implication, clusters), but all three need, or can greatly benefit from, a feedback loop of status information and/or intermediate results to the parent execution context controlling the spawning.

    This can be achieved with processes via bi-directional IPC, but if done through pipes, sockets or message queues, the one-to-many/many-to-one reads and writes require both the parent and the children to use non-blocking IO. That means every process also has to become a state machine. For clusters, the communication further involves the network, and with it bandwidth, latency and topology issues.

    This can be achieved through the file system, but that requires semaphores and/or file locking and non-blocking IO. Again, every process has to become a state machine. For clusters, the files will at some point be remote and so the networking issues are again a factor.
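    To make the state-machine point concrete, here is a minimal, hypothetical sketch (my own illustration, not code from any of the nodes above) of the pipe-based version: four children each report one intermediate result back up a pipe, and even in this trivial case the parent must multiplex the readers with select and track per-child state.

```perl
#!/usr/bin/perl
# Hypothetical sketch: children feed results back to the parent over
# pipes, so the parent must run a select() loop - a small state machine
# keyed by filehandle - just to collect them.
use strict;
use warnings;
use IO::Select;

my $sel = IO::Select->new;

for my $id ( 1 .. 4 ) {
    pipe( my $r, my $w ) or die "pipe: $!";
    my $pid = fork;
    die "fork: $!" unless defined $pid;
    if ( $pid == 0 ) {              # child: do some work, report back
        close $r;
        print {$w} "child $id: ", $id * $id, "\n";
        exit 0;
    }
    close $w;                       # parent keeps only the read end
    $sel->add($r);
}

# Parent's event loop. (Mixing buffered <$fh> with select is safe here
# only because each child writes a single short line and exits.)
my $total = 0;
while ( $sel->count ) {
    for my $fh ( $sel->can_read ) {
        if ( defined( my $line = <$fh> ) ) {
            $total += $1 if $line =~ /:\s*(\d+)$/;
        }
        else {                      # EOF: that child has finished
            $sel->remove($fh);
            close $fh;
        }
    }
}
wait for 1 .. 4;                    # reap the children
print "total = $total\n";           # 1 + 4 + 9 + 16 = 30
```

    And that is the easy case: every extra message type, partial read, or write back down to a child grows the per-child state machine further.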

    A threaded solution needs locks, but in every other way is simpler, easier to debug and faster.

    1. No non-blocking IO, so no state machines required.
    2. No IPC, so no protocol delays.
    3. No file IO, so no disk locks, latency, contention or caching delays.
    4. No network IO, so no network latency, contention or topology problems.

    For that class of parallelisation problems for which shared state, or child-to-parent communication, is either required or beneficial, no other solution comes even close to being as simple or as efficient as threading.
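    By way of contrast, a minimal sketch of the threaded shape (illustrative names and numbers of my own, using the stock threads and Thread::Queue modules): workers pull jobs from one shared queue and push intermediate results straight back to the parent on another - no pipes, no non-blocking IO, no state machine.

```perl
#!/usr/bin/perl
# Hypothetical sketch: shared queues replace the whole IPC apparatus.
use strict;
use warnings;
use threads;
use Thread::Queue;

my $jobs    = Thread::Queue->new( 1 .. 20 );   # pre-loaded work items
my $results = Thread::Queue->new;              # feedback to the parent

sub worker {
    # dequeue_nb returns undef once the (pre-filled) queue is drained
    while ( defined( my $n = $jobs->dequeue_nb ) ) {
        $results->enqueue( $n * $n );          # intermediate result
    }
}

my @workers = map { threads->create( \&worker ) } 1 .. 4;
$_->join for @workers;
$results->enqueue(undef);                      # end-of-stream marker

my $sum = 0;
while ( defined( my $r = $results->dequeue ) ) { $sum += $r }
print "sum of squares 1..20 = $sum\n";         # 2870
```

    The locking lives inside Thread::Queue; the application code never touches a lock, a select loop or a protocol.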

    Once you remove the need for the application programmer to worry about locking - even at the penalty of some extra delays when using read-only references, and the minor cost of a condition test on every access to shared data - the advantages far outweigh those of the alternatives.

    I'm as yet undecided whether Software Transactional Memory (STM) is the right solution for taking locking out of the hands of the application programmer. Having done a fair amount of DB programming, including designing and writing the infrastructure for a unique, widely distributed, multi-transport DB query mechanism, I've encountered the problems that transactions bring with them.

    If the granularity of the transactions is set too big, you kill your performance by blocking concurrent access, and the cost of the intermediate storage needed for rollback can become too high to manage.

    Set the granularity of the transactions too small, and the cost of rollback, when it happens, becomes prohibitive, and/or you risk allowing the use of out-of-date or rolled-back data.

    STM doesn't have the same issues of protocol and transmission latency usually involved with DB accesses, so this may be a non-issue, but there is still enough doubt in my mind to cause me some concern.
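    For illustration only, the optimistic model underlying STM can be caricatured in a few lines: snapshot a version, compute speculatively, and commit only if nothing interfered - otherwise throw the work away and retry. This single-threaded sketch is mine (not Perl6's design nor any real STM implementation); it exists only to show where the rollback cost lives.

```perl
#!/usr/bin/perl
# Caricature of optimistic transactions: the granularity of $txn
# decides how much speculative work a conflict throws away.
use strict;
use warnings;

my %cell = ( value => 0, version => 0 );

sub atomically {
    my ($txn) = @_;
    while (1) {
        my $seen = $cell{version};              # snapshot
        my $new  = $txn->( $cell{value} );      # speculative work
        if ( $cell{version} == $seen ) {        # nothing interfered...
            @cell{ 'value', 'version' } = ( $new, $seen + 1 );  # ...commit
            return $new;
        }
        # Conflict: $new is discarded and $txn re-run from scratch.
        # The bigger the transaction, the more work each retry wastes.
    }
}

atomically( sub { $_[0] + 1 } ) for 1 .. 5;
print "$cell{value}\n";   # 5
```

    In this single-threaded toy the conflict branch never fires; the point is only that the retry path is where both the too-big and too-small granularity costs show up.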

    But, got right, STM is one of several promising mechanisms for taking the "big issue" of shared state out of the hands of the application programmer and clearing the way for simple, safe and ubiquitous threading.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.