Re: Why use threads over processes, or why use processes over threads?

Update: please note I'm talking about threads in the general sense. See my clarification on Perl threads.

I don't see the appeal of threads. Modern kernels on CPUs with modern MMUs can fork processes with very little effort and switch them pretty quickly. I expect the performance will increase further with time.

I don't even care that much about the performance argument - but it used to be a big problem once upon a time long past, so I thought I should get that out of the way first.

And then there's the real argument: safety. Forked processes default to not sharing; threads default to sharing everything. With the former, you have to explicitly share what you desire to be shared while with the latter you have to explicitly make thread local copies of sensitive data.
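The difference in defaults can be seen in a minimal sketch. It is written in Python only for illustration and assumes a Unix-like system where `os.fork` is available; the names `counter`, `bump`, `run_in_fork` and `run_in_thread` are made up for this example. The same ordinary variable is modified once in a forked child and once in a thread, with nothing declared as shared in either case.

```python
import os
import threading

# Ordinary module-level state; nothing here is explicitly marked as shared.
counter = {"value": 0}


def bump():
    counter["value"] += 1


def run_in_fork():
    """Run bump() in a forked child. Returns (value seen by child, value seen by parent)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: works on its own copy of the address space.
        os.close(read_fd)
        bump()
        os.write(write_fd, str(counter["value"]).encode())
        os._exit(0)
    # Parent: the only way to learn the child's value is the explicit pipe.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        child_value = int(reader.read())
    os.waitpid(pid, 0)
    return child_value, counter["value"]


def run_in_thread():
    """Run bump() in a thread. Returns the value seen by the main thread afterwards."""
    worker = threading.Thread(target=bump)
    worker.start()
    worker.join()
    return counter["value"]


if __name__ == "__main__":
    print("fork  :", run_in_fork())    # the child's change stays in the child
    print("thread:", run_in_thread())  # the thread's change is visible to everyone
```

In the forked case the parent only learns the child's value because a pipe was set up on purpose; in the threaded case the change leaks into the main thread without anyone asking for it.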

Every person in their right mind will tell you that the correct approach to security is to disallow by default and exempt desired interactions. Every Perl programmer worth their salt untaints data by denying anything but explicitly permitted input. The list goes on; the correct approach is always to disallow by default and explicitly permit where desired.

Threads break this fundamental principle.

This is hard to argue with - by using threads you inevitably expose yourself to the potential for all sorts of bugs. Since correctness is the primary concern in software development, and all else is secondary, I don't really see any choice but forking.

The current state of affairs is not perfect of course; shared memory or other forms of IPC are harder to use in practice than they ought to be.
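As a rough illustration of what explicit sharing between forked processes involves, here is a minimal Python sketch, assuming a Unix-like system where an anonymous `mmap` is shared across `os.fork`. The function name `fork_with_shared_counter` and the 8-byte layout are assumptions made for this example. One integer lives in a deliberately shared mapping, while an ordinary local variable stays private to each process.

```python
import mmap
import os
import struct


def fork_with_shared_counter(increments):
    """Fork a child that bumps a shared counter and a private one.

    Returns (shared value, private value) as seen by the parent afterwards.
    """
    # Anonymous mapping; on Unix, mmap defaults to MAP_SHARED, so the
    # mapping stays common to parent and child after fork().
    shared = mmap.mmap(-1, 8)
    struct.pack_into("q", shared, 0, 0)
    private = 0

    pid = os.fork()
    if pid == 0:
        # Child: update both the shared slot and its private copy.
        for _ in range(increments):
            (value,) = struct.unpack_from("q", shared, 0)
            struct.pack_into("q", shared, 0, value + 1)
        private += increments
        os._exit(0)

    # Parent: wait for the child, then read what it left behind.
    os.waitpid(pid, 0)
    (shared_value,) = struct.unpack_from("q", shared, 0)
    shared.close()
    return shared_value, private


if __name__ == "__main__":
    print(fork_with_shared_counter(5))
```

Only the bytes placed in the mapping cross the process boundary. The sketch sidesteps synchronization by having a single writer and reading only after `waitpid`; real use with concurrent writers would need a lock or semaphore, which is part of why this is more work than it ought to be.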

Makeshifts last the longest.

Re: Re: Why use threads over processes, or why use processes over threads? by Anonymous Monk on Nov 11, 2003 at 07:35 UTC
"And then there's the real argument: safety. Forked processes default to not sharing; threads default to sharing everything. With the former, you have to explicitly share what you desire to be shared while with the latter you have to explicitly make thread local copies of sensitive data."

Perhaps you haven't actually looked at the docs for threads? To wit:

"It is very important to note that variables are not shared between threads, all variables are per default thread local. To use shared variables one must use threads::shared."
Re^3: Why use threads over processes, or why use processes over threads? by Aristotle (Chancellor) on Nov 11, 2003 at 19:11 UTC
Actually, I know (and knew) about this and thought about it while writing my reply; but I was talking about threads in the general sense, not the threading model found in Perl 5.6+.

For all intents and purposes, "threads" in 5.6+ are userland forks. So in Perl I have the choice between kernel forks and userland forks (whose performance and memory use are as you would expect - just what kernel forks used to be once upon a time) - now guess which ones I'll prefer. At least on sane OSes where kernel forks are available.

Makeshifts last the longest.
Re: Re: Why use threads over processes, or why use processes over threads? by castaway (Parson) on Nov 11, 2003 at 07:41 UTC
"Forked processes default to not sharing; threads default to sharing everything."

Um, no, unless you're talking about the 5005threads, which do default to sharing everything. The newer ithreads default to sharing nothing (everything gets copied at the point you start the thread; after that, each thread only uses its own copy of the variables).

Personally, I found the 5005 behaviour easier to program with than the new one, but that's possibly just me.

C.
Re: Re: Re: Why use threads over processes, or why use processes over threads? by Anonymous Monk on Nov 11, 2003 at 08:15 UTC
"Personally, I found the 5005 behaviour easier to program with than the new one, but that's possibly just me.."

Nope, not just you. The 5005 way was, more or less, following the standard model of sharing found elsewhere. If I had to describe the new thread model (and I've had to on a few occasions), it is as if it were a slow, bloated fork emulation, but with convenience functions for synchronizing access to shared data.
Re: Re: Why use threads over processes, or why use processes over threads? by hardburn (Abbot) on Nov 11, 2003 at 15:28 UTC
"Modern kernels on CPUs with modern MMUs can fork processes with very little effort and switch them pretty quickly."

Win NT (any version) processes are quite expensive, and a significant amount of Perl code runs on Win2k web servers. This is in stark contrast with Linux, where processes are very cheap (they have to be, since Linux threads are almost identical to processes).

"Forked processes default to not sharing; threads default to sharing everything . . . correct approach to security is to disallow by default."

I don't think you can make the analogy between taking user input and a process/thread model. In taking data from (for example) a CGI form, you usually have no idea where the information is coming from, so it is reckless not to validate it. In sharing data between processes, you presumably control everything that happens between the two processes. The data shared is no more untrustworthy than the data you pass between subroutines.

If you happen to be in a situation where you don't control what happens in one of the processes or threads, then you definitely need to do validation. However, I doubt such a situation pops up much.

----
I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident.
-- Schemer

: () { :|:& };:

Note: All code is untested, unless otherwise stated
Re^3: Why use threads over processes, or why use processes over threads? by Aristotle (Chancellor) on Nov 11, 2003 at 19:19 UTC
"Win NT (any version) processes are quite expensive"

Windows doesn't even have forks, so the point is moot anyway. Notice how the Perl 5.6 thread model was largely an attempt to emulate fork() for platforms which don't have it (even if that wasn't its stated goal, it certainly gave that impression).

"In sharing data between processes, you presumably control everything that happens between the two processes."

"Presumably" being the keyword, because this is about the effect of a) bugs and b) security holes. With threads, both occurrences can kill off your entire application. With forked processes, they can only affect the child in question, except where a resource is explicitly shared.

Makeshifts last the longest.
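
The process half of that claim can be sketched as follows, again in Python purely for illustration and assuming a Unix-like system with `os.fork`. The helper names `run_isolated` and `crash` are hypothetical, and the fatal bug is simulated by the child sending itself SIGKILL. The sketch does not attempt to show the threaded case.

```python
import os
import signal


def run_isolated(task):
    """Run task() in a forked child and report how the child ended.

    Returns ("exit", status) for a normal exit or ("signal", number)
    if the child was killed by a signal.
    """
    pid = os.fork()
    if pid == 0:
        # Child: whatever goes wrong here stays in this process.
        try:
            task()
        except BaseException:
            os._exit(1)
        os._exit(0)
    # Parent: unaffected by the child's fate, it just collects the status.
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        return ("signal", os.WTERMSIG(status))
    return ("exit", os.WEXITSTATUS(status))


def crash():
    # Simulated fatal bug: the process is killed outright.
    os.kill(os.getpid(), signal.SIGKILL)


if __name__ == "__main__":
    print(run_isolated(crash))          # the child dies, the parent carries on
    print(run_isolated(lambda: None))   # a well-behaved child exits normally
```

The parent keeps running after the child is killed and merely observes the outcome through `waitpid`; nothing else crosses the boundary unless it was shared on purpose.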