in reply to latest on ithreads vs forks?

Update: In case it is unclear, this post refers ONLY to the situation when using ithreads.

(Though it's also worth remembering that under Win32, fork is emulated using ithreads under the covers.)

It's worth noting that unless you intend to use a module in multiple threads, there is little point in loading that module prior to spawning your threads. Each new ithread gets its own copy of everything already loaded, so it only consumes extra memory.

If you do need to use a module in multiple threads, remember that you cannot call methods across threads, or share objects. That is to say, if you create an object in one thread, and share it with another thread and then try to invoke methods upon the object in the second thread, it won't do what you want it to.
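
As a rough illustration of the point above, here is a minimal sketch of what happens to an ordinary (unshared) object when a thread is spawned. It assumes a perl built with ithreads; the Counter class is hypothetical and exists only for this example. The thread works on its own cloned copy, so method calls made there never reach the parent's object.

```perl
use strict;
use warnings;
use threads;

# Hypothetical class, for illustration only.
{
    package Counter;
    sub new { my ($class) = @_; return bless { n => 0 }, $class }
    sub inc { my ($self) = @_; $self->{n}++; return }
    sub n   { my ($self) = @_; return $self->{n} }
}

my $c = Counter->new;

# The new thread receives a clone of $c, not the parent's object itself.
my $thr = threads->create(sub {
    $c->inc for 1 .. 5;
    return $c->n;              # the count as seen inside the thread
});
my $seen_in_thread = $thr->join;
my $seen_in_parent = $c->n;    # the parent's object was never touched

print "thread saw: $seen_in_thread\n";   # thread saw: 5
print "parent saw: $seen_in_parent\n";   # parent saw: 0
```

This only covers the plain, unshared case; it does not show what happens once threads::shared is involved.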

It is almost always better to require the modules needed by a thread, from within that thread, once it has started.
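
A minimal sketch of that advice, assuming a perl built with ithreads. Digest::MD5 and the worker sub are only stand-ins for whatever module and job your thread really has. The module is loaded with require inside the thread body, so the parent interpreter and any sibling threads never carry a copy of it.

```perl
use strict;
use warnings;
use threads;

# Hypothetical worker: loads its module only once the thread is running.
sub worker {
    my ($text) = @_;
    require Digest::MD5;                  # loaded in this thread only
    return Digest::MD5::md5_hex($text);   # fully qualified, nothing imported
}

my $thr    = threads->create(\&worker, 'hello');
my $digest = $thr->join;

# The main thread's %INC should show that it never loaded the module itself.
my $loaded_in_main = exists $INC{'Digest/MD5.pm'} ? 'yes' : 'no';

print "digest: $digest\n";
print "Digest::MD5 loaded in main thread: $loaded_in_main\n";
```

Because require happens at run time, the cost of loading is paid only by the threads that need the module.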


Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail

Re: Re: latest on ithreads vs forks?
by etcshadow (Priest) on May 27, 2004 at 02:13 UTC
    Actually, loading a module before fork will more likely save you memory. Forking uses "copy on write" semantics. That is, the child process shares its memory pages with its parent until one or the other tries to write into a page. Only at that point does the system make a copy of that page for the process that is trying to write.

    So if you take an application that has a fairly large number of modules (and the perl interpreter itself) sitting in pages that aren't going to get overwritten, compared to a fairly small number of pages that are getting actively written to, then preloading is going to save you memory (provided a fair portion of the modules will be used in more than one of the forked processes). Not to mention the fact that it will also save you time.

    Anyway, here's an example. I've got a web server here with three processes: one is a controller process, and two are worker processes. In the first example, I preload a large number of the perl modules before forking. In the second example, the modules are loaded after forking.

      PID USER     PRI  NI  SIZE  RSS SHARE STAT  LIB %CPU %MEM   TIME COMMAND
    16701 me         0   0 35712  34M 31832 S       0  0.0  1.6   0:02 httpd
    18302 me         0   0 41400  40M 33100 S       0  0.0  1.8   0:04 httpd
    32301 me         0   0 41932  40M 33776 S       0  0.0  1.9   0:02 httpd
    
    
      PID USER     PRI  NI  SIZE  RSS SHARE STAT  LIB %CPU %MEM   TIME COMMAND
     2906 me         0   0  4608 4608  3052 S       0  0.0  0.2   0:00 httpd
     2907 me         0   0 35084  34M  5844 S       0  0.0  1.5   0:10 httpd
     2908 me         0   0 38804  37M  5880 S       0  0.0  1.7   0:11 httpd
    
    Paying careful attention to the SHARE column, you can see that the total memory actually used is about 50 megs with preloading and about 65 megs without. You can also see two seconds of processing time in the controller (parent) process when preloading that are absent from the controller process when not preloading. You can think of those two seconds as "shared" seconds, too, in much the same way as the shared memory (because two seconds happens to be about how long it takes to load the large corpus of perl modules involved here).
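
The preload-before-fork pattern described in this reply can be sketched roughly as follows. This is not the web server measured above, just a minimal stand-in: Digest::MD5 plays the role of the "large corpus" of modules, and the two workers do a trivial job and report back through their exit status. It assumes a system with a real fork.

```perl
use strict;
use warnings;
use Digest::MD5 ();    # preloaded once in the parent, before any fork

my @pids;
for my $id (1 .. 2) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: the module is already in memory, so nothing is loaded
        # again here. Pages that are never written stay shared with the
        # parent on systems that implement fork with copy-on-write.
        my $hex = Digest::MD5::md5_hex("worker $id");
        exit(length($hex) == 32 ? 0 : 1);
    }

    push @pids, $pid;  # parent: remember the child and keep going
}

# Parent: reap both workers and count any non-zero exit status.
my $failed = 0;
for my $pid (@pids) {
    waitpid($pid, 0);
    $failed++ if $? != 0;
}

print "workers: ", scalar(@pids), ", failed: $failed\n";
```

Moving the `use Digest::MD5` line into the child branch (as a `require`) gives the load-after-fork variant, where every worker pays the load time and memory separately.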
    ------------ :Wq Not an editor command: Wq

      What you discussed here is not exactly what BrowserUK mentioned. As he is one of those who have contributed a lot to the threading topic here, by default I trust that when he said thread, he really meant thread, not fork or child processes.

      I like the content of your node, no doubt about that, but I think it is worth clarifying that you didn't strictly stay on the same subject as his.

        On the contrary, I'd say both of these posts are discussing memory use by Perl threads and how it compares to forking. Forking benefits from copy-on-write, while threads do not, meaning that forking tends to use much less memory. This also means that the way to conserve as much memory as possible is exactly the opposite in each case: load as late as possible for threads and as early as possible for forking.

        Ah, brain-fart. He did say "in threads". It just goes to show that I come from a very process/fork intensive world (I work a lot with apache 1.X, which is all about processes and forks... likewise older perl (5.005_03), wherein threads were not really 100%).

        So I guess I was a little off-topic with my reply. Doesn't make it not true, though :-)

        Honestly, I didn't realize that threads did not use shared memory. If anything, I'd have assumed that they used shared memory even more than processes. After all, in threads, the semantic is not copy-on-write... it's synchronize-on-write (so if it were just shared to begin with, there'd be no need to copy the data between threads' memory pages... just synchronize access when performing writes). But, anyway, as I said: I'm no thread-head. (I understand the concepts but have very little actual experience with using them or the details of their implementations.)
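
For what it's worth, the "synchronize on write" behaviour described above is something ithreads only gives you when you ask for it. A minimal sketch, assuming a perl built with ithreads: a variable is shared between threads only if it is marked with threads::shared, and writes to it are serialized with lock(). The counter and the loop sizes are arbitrary choices for the example.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $count :shared = 0;     # explicitly shared; everything else is cloned

my @workers;
for my $id (1 .. 4) {
    push @workers, threads->create(sub {
        for (1 .. 1000) {
            lock($count);  # held until the end of this loop iteration
            $count++;
        }
        return;
    });
}
$_->join for @workers;

print "count: $count\n";   # count: 4000
```

Without the `:shared` attribute each thread would increment its own private copy and the parent would still see 0.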

        ------------ :Wq Not an editor command: Wq