To fork() or not to fork()

asiufy has asked for the wisdom of the Perl Monks concerning the following question:

I've an application like the following:

3 different pieces that should run simultaneously. The first is the actual application logic (taking input from a database, so it scans the DB every xx seconds for new entries), the 2nd is a "timeout" engine that verifies whether a specific user/session should be timed out, and the 3rd is a statistics generator, based on the usage of the application by the users.

All 3 pieces have to run simultaneously, but they were coded as independent daemons.

In a sudden burst of creativity (or stupidity?) I decided to have each one of them be fork()ed off a main script, basically to see if I could. So right now it's working like that, one executable that starts the whole thing. It's running fine that way (no zombies, etc) but I'm still in the development stages, so I don't know what kind of implications this kind of solution will have in "real life" deployment.

Anyway, my question is: should I bother going this way (fork() each piece off a main app), or just run each piece individually, since there is no apparent loss in doing so (is there any GAIN?). I realise that if I go the fork() way, I will have to add better handling of the 3 children. I also read something about having to kill the children every so often and re-spawn them, otherwise no memory would be freed.

I appreciate any input. I'm not posting any code because this is more of a generic question than a specific one, but if necessary I can post snippets.

cheers. alex.


Replies are listed 'Best First'.
Re: To fork() or not to fork() by pg (Canon) on Nov 05, 2003 at 16:47 UTC
I don’t see a need for fork in this case, as those three things are not closely related. It is just fine to run them separately. You didn’t ask, but for your first process, the one that gathers changes to the database, you could simply use a database trigger.
Re: Re: To fork() or not to fork() by asiufy (Monk) on Nov 05, 2003 at 18:30 UTC
pg: thanks for your input. I'm using the database as a queueing mechanism, since my application receives input from many different systems, and I just have thin front-ends grab this input and put an entry in the database. That was the solution I found, lacking a proper queueing system (like MQ). Your suggestion just made me realise I could have the front-ends send UDP signals over to my Perl process, which would in turn fetch the entries from the database (but only once the UDP signals arrive, not every xx secs as I do now). Unfortunately the front-ends are mostly written in C, and I'm not the one maintaining that code ... Anyway, excellent suggestion (++)! I'm using MySQL, btw. I don't think there's anything like Oracle's database trigger in it...
Re: To fork() or not to fork() by ehdonhon (Curate) on Nov 05, 2003 at 23:00 UTC
Questions to consider:
Is there ever a time when it's valid for one of these things to run without the others? If yes, that's an argument that they should be separate programs, or at least the same program with command line options. If no, then it might be better to have them all fork from one process. That way if one dies, the default child and hup signal handlers will kill the others.
Can you gain any efficiency by having all the code run under one process? If each of your programs has a very high startup cost, it might be more efficient if you only have to start it once.
Is shared memory useful at all? Granted, you don't get real "shared memory" when you fork(), but on many operating systems you do get copy-on-write memory, which means that if you have a lot of memory that you only need to populate once and then only read, you could reduce memory usage by having just one program that forks.
What is the easiest to maintain? If all you are doing is dumping three big programs into one huge program, you are going to make it harder on yourself later on. However, if this gives you the opportunity to re-use code, it could be a bonus for you.
I'm sure there are more points to consider, but those are a few just off the top of my head.
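For illustration, here is a minimal sketch of the "fork everything from one process" arrangement discussed above. It is not the original poster's code: the three workers are hypothetical stand-ins for the real daemons, `run_supervised` is a made-up name, and the "if one dies, stop the rest" policy is done explicitly by the parent with wait() and kill(), which is an assumption about how one might want to handle it.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX ();

# Fork one child per named worker, wait for the first one to exit,
# then send TERM to the remaining children and reap them.
# Returns the name of the worker that exited first.
sub run_supervised {
    my (%workers) = @_;
    my %name_of;                          # pid => worker name

    for my $name (sort keys %workers) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {                  # child: run the worker, then leave
            $workers{$name}->();
            POSIX::_exit(0);              # skip END blocks inherited from the parent
        }
        $name_of{$pid} = $name;           # parent: remember who is who
    }

    my $dead  = wait();                   # blocks until any child exits
    my $first = delete $name_of{$dead};

    kill 'TERM', keys %name_of;           # ask the survivors to stop
    waitpid($_, 0) for keys %name_of;     # reap them so no zombies remain
    return $first;
}

# Hypothetical stand-ins for the three daemons (assumptions, not real code).
my $first = run_supervised(
    app     => sub { sleep 1 while 1 },   # would poll the database
    timeout => sub { sleep 1 while 1 },   # would expire idle sessions
    stats   => sub { sleep 1 },           # exits early to demonstrate cleanup
);
print "$first exited first; the other children were stopped\n";
```

A real supervisor might prefer to re-spawn the dead child instead of stopping its siblings; that is a policy decision this sketch does not try to settle.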
Re: To fork() or not to fork() by mcogan1966 (Monk) on Nov 05, 2003 at 19:47 UTC
I've been dealing with the multiple-process issue myself. I have a couple of nodes that might be of interest. They might give you some insight on the issue: "Concurrent Processes" and "IO::Select - is it right for this?". Hope these help.
Re: To fork() or not to fork() by Abigail-II (Bishop) on Nov 05, 2003 at 16:50 UTC
Uhm, what's the difference? If you don't use fork() explicitly, how do you start the three parts? If you do a `system "daemon &"`, you are forking too, except that's handled by `perl` instead of your code.
"I also read something about having to kill the children every so often and re-spawn it again, otherwise no memory would be freed."
If you have a memory leak and want to cope with that by restarting the daemon, does it matter whether you used explicit forks or not? If they leak, they leak, and you have to fix that (by either fixing your code, or restarting the daemon (`exec $0 => @ARGV`)), regardless of how the daemons were started.
I think the interesting question is "do we run as 3 different processes/LWPs/threads and let the kernel scheduler do our context switches" or "do we run as a single process/LWP/thread and do our own scheduling". Both have advantages and disadvantages.
Abigail
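To make the second option concrete, here is a rough sketch of a single process doing its own scheduling. The task names, intervals and the `run_scheduler` function are hypothetical, and the fixed run time exists only so the example terminates; a real daemon would presumably loop forever.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Run each task whenever its interval has elapsed, in one process.
# $tasks is an arrayref of { name => ..., every => seconds, code => sub }.
sub run_scheduler {
    my ($tasks, $run_for) = @_;
    my $start = time();
    my $end   = $start + $run_for;
    $_->{next} = $start for @$tasks;      # everything is due immediately

    while ((my $now = time()) < $end) {
        for my $t (@$tasks) {
            next if $t->{next} > $now;    # not due yet
            $t->{code}->();               # a slow task delays all the others
            $t->{next} = $now + $t->{every};
        }
        # Sleep until the earliest due time, but never past the end.
        my ($soonest) = sort { $a <=> $b } map { $_->{next} } @$tasks;
        $soonest = $end if $soonest > $end;
        my $nap = $soonest - time();
        sleep($nap) if $nap > 0;
    }
}

# Hypothetical stand-ins for the three pieces (assumptions, not real code).
run_scheduler([
    { name => 'app',     every => 0.2, code => sub { print "scan database\n" } },
    { name => 'timeout', every => 0.5, code => sub { print "check sessions\n" } },
    { name => 'stats',   every => 1.0, code => sub { print "update stats\n" } },
], 1);
```

The trade-off shows up in the inner loop: nothing runs while one task's code is executing, which is exactly the work the kernel scheduler does for you in the three-process version.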