in reply to Re: Re: Re: Re: Why use threads over processes, or why use processes over threads?
in thread Why use threads over processes, or why use processes over threads?

Giving an example in Perl would be counter-productive, as it would invite comparisons that would only serve to highlight the shortcomings of the current implementation of Perl's threading.

In essence, in filter-type applications (reading from a file, performing some processing, then writing the results to another file), much of the time is spent waiting on the kernel to complete IO. Throughput can be vastly improved by having a read thread, a processing thread, and a write thread. Written correctly, this allows the processing thread to run at full speed, overlapping the processing with the IO.

This processing model only works if the three threads can share the buffer space for input and output. Forking doesn't work here, because you then need IPC to move the data between the three processes; instead of the processing thread waiting on the reads and writes, it waits on the IPC. You've just moved the goalposts, not removed the waiting.

Unfortunately, with the current implementation, even using pre-spawned threads, the underlying duplication/replication in Perl's shared memory model, combined with the coarse granularity of its semaphoring, doesn't allow this model to be coded efficiently, hence I won't provide sample code.


Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
Hooray!
Wanted!


Re^6: Why use threads over processes, or why use processes over threads?
by Aristotle (Chancellor) on Nov 12, 2003 at 03:33 UTC
    You have to properly serialize reads, writes, and processing anyway, regardless of how you share the buffers - whether inside a single process between threads, or using shared memory between multiple processes. I don't understand why there should be any difference here.

    Makeshifts last the longest.

      Under Win32, and probably other systems that support threads, there are several APIs for performing synchronisation. Each of these has a different set of costs and benefits.

      For example, the process of synchronising access to a shared buffer in the above scenario might use either Mutex objects or Critical Section objects.

      The former, in common with SysV semaphores, uses named, shared, system memory to store state, as is required for inter-process signalling. Access requires a context switch into kernel space, a relatively costly operation.

      The latter uses ordinary process-space memory to store state. It is only usable within a single process, but is usable by all the threads of that process. Access, though managed by system APIs, stays entirely within user space on the uncontended path, avoiding the kernel transition, and is hence faster.

      Critical section objects provide synchronization similar to that provided by mutex objects, except that critical section objects can be used only by the threads of a single process. Event, mutex, and semaphore objects can also be used in a single-process application, but critical section objects provide a slightly faster, more efficient mechanism for mutual-exclusion synchronization (a processor-specific test and set instruction). Like a mutex object, a critical section object can be owned by only one thread at a time, which makes it useful for protecting a shared resource from simultaneous access.

      Note: I can only view this from the perspective of Win32 and OS/2. There may well be similar facilities available in threaded Unix kernels.

      For the buffers involved in the read-process-write scenario, an even simpler mechanism can be used. If a circular buffer, or circular linked list, is used to buffer between the threads, then they can be synchronised using one or two pointers in process memory, relying on the atomicity of integer increments and decrements where that can be relied upon, or coordinating access to those pointers using Critical Sections. This can be cheaper still.

      See Interlocking variable access & Interlocking Single-linked lists for better descriptions.



Re: Re: Re: Re: Re: Re: Why use threads over processes, or why use processes over threads?
by Anonymous Monk on Nov 11, 2003 at 23:40 UTC

    So you are saying this is a case of theory being confounded by reality where Perl is concerned? Do you have any examples utilizing other languages that have both threads and fork and SysV IPC?

      I don't have access to any OS other than Win32 currently.

      I could produce examples written in C for this platform that demonstrate that throughput can be increased (on this platform) using a 3-thread process over a single-threaded process, despite the OS providing considerable caching. However, as I cannot match that with a fork-and-IPC example, there is little point.

      I was party to some comparisons done on AIX showing that threading could outperform separate processes with shared memory buffers and system semaphores. The nature of the task described (3 threads, 2 buffers, each buffer written by one thread and read-only to a second) is probably the strongest example, by reason of its simplicity, for the threaded model.

      However, there are many variations, where input from one or more external sources is subject to wait states and the data needs to be centrally processed, in which threading has an advantage over the forked model.

      There are, of course, many counter-examples where forking is the hands-down winner on OSes where forking is a system-level facility. And that's the bottom line: if the OS supports one natively, that will be the better option over any user-space implementation of the other.

      Where an OS supports both, threads will usually win when IPC is required, because much lighter semaphore implementations are possible than with shared memory between processes. Where no IPC, or only trivial volumes of it, is required, and the OS has many years of development behind making forking fast and efficient, the forking model will prevail.

      Then the waters muddy further when you start considering non-native threads. Personally, I don't think it makes a great deal of sense to implement threading at the application level, i.e. outside or alongside the OS scheduler's control, as with Java's green threads. The result is a mish-mash of two disparate systems interacting to control the scheduling, and the results of that interaction are extremely difficult to predict or manage.

