in reply to Re: Re: Why use threads over processes, or why use processes over threads?
in thread Why use threads over processes, or why use processes over threads?

I'm not familiar with Linux or most other Unixes, but forking would have to involve at least the replication of filesystem objects: duplicating open file handles, sockets and so on, along with their associated state.

I would imagine that it also requires the creation of handles to the existing memory objects in order to handle copy-on-write (COW) and so on.

In addition, every write, which on the evidence of Abigail above can frequently mean a perl-level read, will result in a memory copy operation (though I'm not sure what the granularity is). There will also be some overhead associated with detecting writes to shared memory segments. Whether this is done via a software or a hardware interrupt, the effect upon the L1 and L2 caches can be expensive too.

It's unclear to me how forking handles other shared handles like DB connections, hardware connections to tape drives, serial ports and the like, but I think that it is probably down to the user to handle this rather than fork.
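On the plain file-handle part of that question, here is a minimal sketch in Python (picked only for brevity; the function name is made up for illustration). It assumes a Unix-like system where os.fork is available. Under POSIX, the child's descriptor refers to the same open file description as the parent's, so the file offset is shared between them. It says nothing about DB connections or devices.

```python
import os
import tempfile


def parent_read_after_child():
    """Fork, let the child read 5 bytes, then read in the parent (hypothetical helper)."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"0123456789")
        os.lseek(fd, 0, os.SEEK_SET)       # rewind before forking
        pid = os.fork()
        if pid == 0:
            # Child: consume the first 5 bytes through the inherited descriptor.
            try:
                os.read(fd, 5)
            finally:
                os._exit(0)                # leave without running parent cleanup
        os.waitpid(pid, 0)                 # wait until the child has done its read
        # Parent: the shared offset was advanced by the child's read.
        return os.read(fd, 5)
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(parent_read_after_child())       # prints b'56789'
```

So parent and child reading the same inherited handle will step on each other unless they coordinate.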

None of these things is individually expensive, but the convenience of spawning a thread without requiring any of this is considerable. The greatest use, and the greatest benefit from threads is for performing asynchronous reads (from whatever). This use is simply not possible with forks. The select model just doesn't compare for usability, and event-driven models require you to throw away even standard structured programming techniques, never mind object-oriented models, and revert to relying upon global state.

Finally, the benefits of co-routines are totally absent from the forking model, but are almost trivial to implement using threads.
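As a rough illustration of the co-routine point, here is a sketch in Python using only the standard threading and queue modules. ThreadCoroutine, CoroutineDone and counter are hypothetical names invented for this example. The body runs in its own thread, and a semaphore plus a one-slot queue hand control back and forth so that only one side runs at a time.

```python
import queue
import threading


class CoroutineDone(Exception):
    """Raised by resume() once the coroutine body has returned."""


class ThreadCoroutine:
    """Run body(yield_) in its own thread, one step per resume() call."""

    _DONE = object()  # sentinel: the body has finished

    def __init__(self, body):
        self._values = queue.Queue(maxsize=1)   # coroutine -> caller
        self._go = threading.Semaphore(0)       # caller -> coroutine
        self._finished = False
        self._thread = threading.Thread(
            target=self._run, args=(body,), daemon=True)
        self._thread.start()

    def _run(self, body):
        self._go.acquire()              # wait for the first resume()
        body(self._yield)
        self._values.put(self._DONE)

    def _yield(self, value):
        self._values.put(value)         # hand a value to the caller
        self._go.acquire()              # sleep until resumed again

    def resume(self):
        if self._finished:
            raise CoroutineDone
        self._go.release()              # let the coroutine run one step
        value = self._values.get()      # block until it yields or ends
        if value is self._DONE:
            self._finished = True
            raise CoroutineDone
        return value


def counter(yield_):
    """Example body: yields 0, 1, 2 and keeps its loop state between resumes."""
    for i in range(3):
        yield_(i)


if __name__ == "__main__":
    co = ThreadCoroutine(counter)
    print(co.resume(), co.resume(), co.resume())   # prints: 0 1 2
```

The body keeps its own stack and local variables between calls, which is the property that is awkward to get from separate forked processes.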

I don't see threads and processes as an either/or proposition. In an ideal world, the programmer would have both spanners in his toolkit, and would be free to choose whichever is appropriate for the task at hand. For some tasks one is appropriate, for others, the other. In some cases a mixture of the two makes perfect sense, if the underlying system supports both efficiently. The best choice will sometimes be dictated by the underlying system.


Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail

Re: Re: Re: Re: Why use threads over processes, or why use processes over threads?
by Anonymous Monk on Nov 11, 2003 at 21:15 UTC
    The greatest use, and the greatest benefit from threads is for performing asynchronous reads (from whatever). This use is simply not possible with forks.

    I'm sure I must just be misunderstanding your point. Could you please expand on what you meant by this (with an example)?

      Giving an example in perl would be counter-productive, as it would invite comparisons that would only serve to highlight the shortcomings of the current implementation of perl's threading.

      In essence, in filter-type applications that read from a file, perform some processing, and then write to another file, much of the time is spent waiting on the kernel to complete I/O. Throughput can be vastly improved by having a read thread, a processing thread and a write thread. Written correctly, this allows the processing thread to run at full speed, overlapping the processing with the I/O.

      This processing model only works if the three threads can share the buffer space for input and output. Forking doesn't work for this as you then need to use IPC to communicate the data between the 3 processes, and instead of the processing thread having to wait on the reads and writes, it has to wait on the IPC. You've just moved the goalposts, not removed the waiting.

      Unfortunately, with the current implementation, even using pre-spawned threads, the underlying duplication/replication in perl's shared memory model, combined with the coarse granularity of the semaphoring, doesn't allow this model to be coded as efficiently, hence I won't provide sample code.
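
      Outside perl, the read/process/write model described above can be sketched roughly as follows. This is Python, and it is only an illustration under stated assumptions: the pipeline function, its parameter names and the queue depth of 4 are all made up for the example, and error handling is left out.

```python
import queue
import threading

_EOF = object()  # sentinel marking the end of the stream


def pipeline(read_chunks, process, write_chunk, depth=4):
    """Overlap reading, processing and writing using three threads.

    read_chunks: iterable yielding input buffers (e.g. blocks read from a file)
    process:     function applied to each buffer
    write_chunk: function that consumes each processed buffer
    depth:       how many buffers may be queued between stages (assumed value)
    """
    inbox = queue.Queue(maxsize=depth)    # reader -> processor
    outbox = queue.Queue(maxsize=depth)   # processor -> writer

    def reader():
        for chunk in read_chunks:
            inbox.put(chunk)              # passes a reference, not a copy
        inbox.put(_EOF)

    def processor():
        while (chunk := inbox.get()) is not _EOF:
            outbox.put(process(chunk))
        outbox.put(_EOF)

    def writer():
        while (chunk := outbox.get()) is not _EOF:
            write_chunk(chunk)

    threads = [threading.Thread(target=stage)
               for stage in (reader, processor, writer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


if __name__ == "__main__":
    out = []
    pipeline([b"ab", b"cd"], bytes.upper, out.append)
    print(out)                            # prints [b'AB', b'CD']
```

      Because the three threads live in one address space, the queues carry references to the buffers and the data itself is never copied between stages. In CPython specifically, pure-Python processing does not run in parallel with other Python code, but blocking reads and writes release the interpreter lock, so the overlap with I/O still happens.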



        You have to properly serialize reads, writes, and processing anyway, regardless of how you share the buffers - whether inside a single process between threads, or using shared memory between multiple processes. I don't understand why there should be any difference here.
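
        For what it's worth, here is a hedged sketch of the shared-memory-between-processes variant, in Python, assuming a Unix-like system where the "fork" start method is available. The function name, the 4096-byte buffer and the length-prefix layout are illustrative assumptions only. The two semaphores do the same serializing that a threaded version would need.

```python
import mmap
import multiprocessing
import struct


def shared_buffer_roundtrip(messages):
    """Send byte strings from a child process to the parent via shared memory.

    Assumes each message fits in the 4096-byte buffer minus a 4-byte length.
    """
    ctx = multiprocessing.get_context("fork")   # Unix-only assumption
    buf = mmap.mmap(-1, 4096)                   # anonymous map, shared across fork
    slot_free = ctx.Semaphore(1)                # producer may write
    slot_full = ctx.Semaphore(0)                # consumer may read

    def producer():
        for msg in messages:
            slot_free.acquire()                 # wait until the slot was consumed
            buf[0:4] = struct.pack("<I", len(msg))
            buf[4:4 + len(msg)] = msg
            slot_full.release()                 # tell the parent data is ready

    child = ctx.Process(target=producer)
    child.start()

    received = []
    for _ in messages:
        slot_full.acquire()                     # wait for the child to fill the slot
        (length,) = struct.unpack("<I", buf[0:4])
        received.append(bytes(buf[4:4 + length]))
        slot_free.release()                     # hand the slot back
    child.join()
    buf.close()
    return received


if __name__ == "__main__":
    print(shared_buffer_roundtrip([b"alpha", b"beta"]))   # prints [b'alpha', b'beta']
```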

        Makeshifts last the longest.

        So you are saying this is a case of theory being confounded by reality where Perl is concerned? Do you have any examples utilizing other languages that have both threads and fork and SysV IPC?