dshahin has asked for the wisdom of the Perl Monks concerning the following question:

I'm writing a piece of code for a customer that forks a copy of itself.
Here is the (edited) output of top:
  
PID   USER     PRI  NI  SIZE  RSS SHARE STAT  LIB %CPU %MEM   TIME COMMAND
29943 dshahin    0   0  4808 4808  4476 S       0  0.0  1.8   0:00 MIO2
29940 dshahin    0   0  4792 4792  4468 S       0  0.0  1.8   0:00 MIO2
My question is regarding the 'SHARE' column.
Does this mean that each additional process only uses 'SIZE - SHARE' kilobytes of new memory? The customer is concerned (rightly so) about over-utilizing system resources. They might run many copies simultaneously; will each one only eat up 'SIZE - SHARE'k of 'fresh' memory?
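
Going by the top output above, that would be roughly 4808k - 4476k = 332k of 'fresh' memory per copy, if I'm reading the columns right.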

I want to be able to tell them there is a 5 meg initial hit, but that each new process uses less than 1 meg of new memory. Am I lying? :)

A lot of this memory usage is from IO::Socket::SSL. My boss wants me to rewrite it in C to bring down the memory usage, but that would take much longer and be more prone to errors.

What is a better solution?

Any general memory-saving tips are appreciated.


Replies are listed 'Best First'.
(tye)Re: collective unconcious (about shared memory...)
by tye (Sage) on Apr 17, 2001 at 07:37 UTC

    On most modern versions of Unix, all new processes will initially share all (or nearly all) of their memory with their parent (each new process gets its own new list of page table entries that define how to map virtual memory address ranges into real memory and/or swap space pages). After that, when any of the related processes writes to a shared page of memory, a fault will occur which will cause that page of memory to be copied to a page that is local to the writing process.

    So you have to let your fork()ed processes run for a while before you can get a feel for how much memory they will manage to continue to share.
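
    For a rough before-and-after view, here is a minimal sketch. It is Linux-specific, and exactly which pages the 'shared' field of /proc/self/statm counts varies by kernel version (top's SHARE column is derived from the same statm data), so treat the numbers as a hint rather than gospel:

        use strict;

        # print the size/resident/shared page counts for this process
        # (all three are in pages; see proc(5) for the statm format)
        sub statm {
            open my $fh, '<', '/proc/self/statm' or die "statm: $!";
            my ($size, $res, $shared) = split ' ', scalar <$fh>;
            return "size=$size res=$res shared=$shared (pages)";
        }

        my $pid = fork;
        die "fork: $!" unless defined $pid;
        if ($pid == 0) {                    # child
            print "just after fork: ", statm(), "\n";
            my @big = (1) x 100_000;        # scribble on memory, forcing
                                            # copy-on-write page faults
            print "after writing:   ", statm(), "\n";
            exit 0;
        }
        waitpid $pid, 0;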

    Since IO::Socket::SSL eventually uses some C code, you should be sure to put that code into a shared library. That way that chunk of memory will always remain shared between all processes (even if they aren't related) since the shared library is loaded using mmap (or an equivalent).
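
    On Linux you can see exactly which .so files a process has mapped in, and hence which chunks of memory are candidates for that kind of system-wide sharing, by reading /proc/$$/maps. A quick sketch (assuming IO::Socket::SSL pulls in the OpenSSL libraries and its XS glue via Net::SSLeay):

        use strict;
        use IO::Socket::SSL;    # loads the SSL C code via DynaLoader

        # list every distinct shared object mapped into this process;
        # the read-only segments of these are shared with any other
        # process using the same libraries
        open my $maps, '<', "/proc/$$/maps" or die "maps: $!";
        my %so;
        while (<$maps>) {
            $so{$1} = 1 if m{(/\S+\.so\S*)\s*$};
        }
        print "$_\n" for sort keys %so;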

    Perl code compiles to an opcode tree. I'm curious if Perl manages to leave large chunks of this tree as read-only so that related processes can continue to share that memory.

            - tye (but my friends call me "Tye")
      Perl code compiles to an opcode tree. I'm curious if Perl manages to leave large chunks of this tree as read-only so that related processes can continue to share that memory.

      My knowledge of *nix kernel memory management is stale at this point, so take this with a grain of salt.

      It used to be the case that the read-only pages were so marked in the executable file. The executable also contained some number of writable data pages, and info for creating a "blank static storage" section to house uninitialized structures. Additional storage (e.g., allocated by malloc()) grew upwards from writable storage via sbrk(), and the stack grew downwards from high memory.

      Perl processes build their opcode trees in writable space. I'm not aware of a mechanism to mark writable pages as read-only at runtime.

      However, depending on how fork() is implemented, writable pages from the parent process may be shared with the child process using a "copy on write" scheme, in which pages are shared until one of the processes tries to write into a page, at which point a copy is made for each process. Such a scheme would work well with Perl, allowing the opcode tree to be shared.

      Now, will somebody please update me by telling me how this is wrong?

        What I was describing was "copy on write" (I probably should have at least mentioned that term, eh?). I must have done a poor job of describing it if you didn't recognize it. Sorry.

        My question was whether code that is compiled before the fork() would remain shared (for very long) after the fork() or whether Perl updates things (like reference counts or state information) in the opcode tree itself such that Perl running the code causes the opcode tree to be written to and that memory to become unshared.

        I recall this coming up a long time ago and vaguely recall that the opcode tree was written to and someone thinking about trying to change that. I wonder if that memory is at all accurate and if any changes were made.

        BTW, your description of memory architecture matches my understanding of the common layout. Perhaps one missing piece is how mmap() fits in. mmap() was a great idea that unified how the page file (a.k.a. swap space) worked with shared memory and I/O buffers.

                - tye (but my friends call me "Tye")
      >Since IO::Socket::SSL eventually uses some C code, you should be sure to put that code into a shared library. That way that chunk of memory
      >will always remain shared between all processes (even if they aren't related) since the shared library is loaded using mmap (or an equivalent).

      I'm not sure what you mean by this.

      So parents and children share memory, but what about separate invocations of the Perl interpreter?
      That is, if I invoke the same program from different shells, do they share memory? They must use the same shared libs, right?

      dshahin

        Yes, one of the points behind shared libraries is that the read-only sections of those libraries get shared by all processes that use the same library (even if the processes aren't related).

        The parent and child [after fork() but before exec()] have the potential to share a lot more memory.

                - tye (but my friends call me "Tye")
Re: collective unconcious (about shared memory...)
by reyjrar (Hermit) on Apr 17, 2001 at 04:44 UTC
    If you're using shared memory, what does the ipcs output look like? Maybe I'm crazy, and that's very possible, but with IPC::Shareable, as long as you use the same "glue", your data will share the memory space allotted. This may or may not be helpful to you, as I'm not sure what you're doing with the shared memory. Also, with IPC::Shareable, if you are attempting to lock the shared memory from the other processes you'll have some problems, as children inherit locks from their parents and that gets kinda messy. Check the results of an 'ipcs' on the system, and clean up old semaphores and shared memory handles.
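
    In case it helps, here's a minimal sketch of the same-glue idea. The 'MIO2' glue string and the hash layout are made up for illustration (IPC::Shareable wants glue of four characters or fewer), and the locking caveat above applies:

        use strict;
        use IPC::Shareable;

        # parent and child tie to the same glue, so they see one copy
        tie my %stats, 'IPC::Shareable', 'MIO2', { create => 1, destroy => 1 }
            or die "cannot tie shared hash: $!";

        my $pid = fork;
        die "fork: $!" unless defined $pid;
        if ($pid == 0) {
            (tied %stats)->shlock;      # the inherited-locks caveat above
            $stats{requests}++;         # applies to calls like these
            (tied %stats)->shunlock;
            exit 0;
        }
        waitpid $pid, 0;
        print "requests seen so far: $stats{requests}\n";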

    General memory-saving tips that I use (feel free to beat me if this sounds stupid):
    1) Scope - never allow variables to exist where they don't need to.
    2) If there's a potential to return 5 bazillion records or lines on a db query/file open, process the records line by line; don't read them all into an array and THEN work on them one at a time (see the sketch after this list).
    3) Pass references to subroutines, not copies (this could be dangerous, so WATCH WHAT YOU ARE DOING!).
    4) Buy more RAM, it's dirt cheap! ;)
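
    A small sketch of tips 2 and 3 together (the file name and record format are made up for illustration):

        use strict;

        # tip 2: stream the file one line at a time instead of slurping
        # the whole thing into an array first
        open my $fh, '<', 'records.txt' or die "records.txt: $!";
        while (my $line = <$fh>) {
            chomp $line;
            process_record(\$line);   # tip 3: pass a reference, not a copy
        }
        close $fh;

        sub process_record {
            my ($line_ref) = @_;
            # reading through the reference costs nothing extra; writing
            # to $$line_ref would change the caller's data, which is the
            # danger warned about in tip 3
            my @fields = split /,/, $$line_ref;
            return scalar @fields;
        }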

    That's all I could think of off the top of my head. If you provide a little more insight into what you're doing, maybe an example piece of code, we might be able to help more.


    -brad..
      I'm not really interested in utilizing shared memory so much as I need to justify/clarify the amount of memory used by my processes.
      But if it can help me use fewer system resources overall, I might be interested.

      thanks :)

Re: collective unconcious (about shared memory...)
by dws (Chancellor) on Apr 17, 2001 at 03:48 UTC
    This doesn't answer your question, but...

    The customer might well care about resources, but RAM is cheap compared to the time it would take to recode what you've done into C (and then debug it). Unless the customer is deploying this on lots and lots of machines, or unless the cost of taking a server out of production to add RAM is prohibitive, recoding in C might not be such a win.

      You're right. But considering how many of these processes might be running simultaneously, I still want to optimize to some degree. I recognize the tradeoff (see the original comment), but it is the question of shared memory that may tip the scale in Perl's favor. Thanks!

      Plus, the client will be running this on every machine...