msalerno has asked for the wisdom of the Perl Monks concerning the following question:

I have a script that uses forks. Each child calls multiple subs. Of course those subs exist in MAIN and get copied by the fork operation, but is it necessary to have them in MAIN? Is there a way to have the subs exist only within the child processes? For the sake of code management and for some callback operations, I am not using anonymous subs.

Replies are listed 'Best First'.
Re: Fork and multiple subs
by Illuminatus (Curate) on Feb 16, 2011 at 20:41 UTC
    I am curious as to why it matters. The actual sub code takes almost no memory, and will use no additional memory if the subs are not actually called. If you have a lot of subs and are worried about fork performance, that's not an issue. A fork makes a copy of the data portion of the process, but uses the same code portion, so the number of subs won't affect fork performance.

    fnord

      A fork makes a copy of the data portion of the process, but uses the same code portion, so the number of subs won't affect fork performance

      fork doesn't copy anything. Individual memory pages get copied the first time they are changed after a fork. But maybe you were talking abstractly and meant they wouldn't eventually get copied. I don't think that's true either.

      Specifically, I'm pretty sure some fields of some ops change during execution (Upd: I'm not so sure anymore), and I don't think there's any attempt to keep the ops in a separate memory page from variables that may change. Even if ops were sequestered into their own memory page, loading a module or otherwise compiling code after the fork would change the page.

      Update: Cleaned up phrasing. Added last sentence.

        fork doesn't copy anything
        Even though I have been working with Linux exclusively for the last 3 years, I still 'default' to Solaris in my *nix responses :(. Linux initially copies only the page tables (very small). Solaris, on the other hand, does copy all of the address space (unless you use vfork). In both environments, text (i.e. code) pages are read-only. I am 99% sure that in Solaris, text pages are only ever put into physical memory once (which includes the text portion of shared libs once they are loaded the first time), and are referenced by all processes using them. For Linux, it would not matter in this case, since pages are copy-on-write, and text pages can never be written to (hence, they would never be copied).

        fnord

      I agree, it probably doesn't matter. However, most of the modules I am loading are being used only by the children. I know that with threads, you have a little more control over which modules get loaded in the parent.
Re: Fork and multiple subs
by fidesachates (Monk) on Feb 16, 2011 at 21:45 UTC
    I think the answer you're looking for is to break up the use. If I've read correctly (please let me know if I misunderstood the problem), you have subs in a module that is loaded, and you want to delay that loading so the children do it rather than having the parent load the module as well.

    Your solution, I believe, is documented here: [link removed]. I could be totally wrong, but I figured I'd let you decide for yourself.
Re: Fork and multiple subs
by cdarke (Prior) on Feb 17, 2011 at 08:29 UTC
    If your child process code is that different from the parent, would it not be better to have two different programs (parent and child) and exec after the fork?
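
    A minimal sketch of the fork-then-exec idea, under some assumptions: the "child program" here is a hypothetical one-liner run through the same perl binary ($^X) so that the example is self-contained. In practice it would be a separate script file.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a separate child program: a one-liner run by
# the same perl binary ($^X). A real setup would exec a script file instead.
my @child_cmd = ( $^X, '-e', 'print "child program running as pid $$\n"' );

my $pid = fork;
die "fork failed: $!" unless defined $pid;

if ( $pid == 0 ) {
    # Child: replace this process image with the separate program.
    exec @child_cmd or die "exec failed: $!";
}

# Parent: wait for the child and collect its exit status.
waitpid( $pid, 0 );
my $exit_status = $? >> 8;
print "parent $$ reaped child $pid, exit status $exit_status\n";
```

    After the exec, nothing the parent compiled survives in the child, so the parent's subs and modules are not carried along. The price is starting a fresh interpreter for every child.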
Re: Fork and multiple subs
by anonymized user 468275 (Curate) on Feb 16, 2011 at 21:52 UTC
    first idea in head:
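    defined( my $pid = fork ) or die "fork failed: $!";    # undef means the fork itself failed
    unless ($pid) { require 'childlib.pl'; ... }           # child only: load the child's subs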
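
    A slightly fuller sketch of the same idea, with assumptions stated: the core module Digest::MD5 stands in for a child-only library (it is not the OP's code), and a pipe is used only so the parent can show that the child really used the module.

```perl
use strict;
use warnings;

# Assumption: Digest::MD5 (a core module) stands in for a child-only library.
my $module_file = 'Digest/MD5.pm';

pipe( my $reader, my $writer ) or die "pipe failed: $!";

my $pid = fork;
die "fork failed: $!" unless defined $pid;

if ( $pid == 0 ) {
    # Child: load the module at run time, after the fork.
    close $reader;
    require Digest::MD5;
    print {$writer} Digest::MD5::md5_hex('hello'), "\n";
    close $writer;
    exit 0;
}

# Parent: never loads the module; it only reads the child's result.
close $writer;
my $digest = <$reader>;
close $reader;
waitpid( $pid, 0 );
chomp $digest;

# %INC records every file loaded via require/use in *this* process.
my $parent_loaded = exists $INC{$module_file} ? 'yes' : 'no';
print "child computed $digest; module loaded in parent: $parent_loaded\n";
```

    Because require runs at run time rather than compile time, the module's subs exist only in the process that executes the require, which here is the child.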

    One world, one people

Re: Fork and multiple subs
by karavelov (Monk) on Feb 18, 2011 at 02:01 UTC

    As suggested, you have 2 options:

    * 'require' modules dynamically in the child process
    * 'exec' completely different program

    In either case, the performance hit of recompiling the code (required module/executed code) after each fork will be huge compared to just loading everything in the parent and forking afterwards.
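
    One rough way to check this on your own system is sketched below. Everything in it is an assumption for illustration: Data::Dumper stands in for the child's modules, 20 children is an arbitrary count, and run_children is a hypothetical helper. The numbers will vary a lot with the modules involved, so treat it as a measuring aid rather than a result.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

my $children = 20;    # arbitrary count for illustration

# Fork $children processes; each runs $work and exits.
# Returns the wall-clock time until all of them have been reaped.
sub run_children {
    my ($work) = @_;
    my $start = time;
    my @pids;
    for ( 1 .. $children ) {
        my $pid = fork;
        die "fork failed: $!" unless defined $pid;
        if ( $pid == 0 ) { $work->(); exit 0 }
        push @pids, $pid;
    }
    waitpid( $_, 0 ) for @pids;
    return time - $start;
}

# Case 1: every child compiles the module itself after the fork.
my $load_in_child = run_children( sub { require Data::Dumper } );

# Case 2: the parent compiles it once (not timed here), then forks.
require Data::Dumper;
my $load_in_parent = run_children( sub { } );

printf "require in each child:  %.4fs\n", $load_in_child;
printf "require once in parent: %.4fs\n", $load_in_parent;
```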