in reply to Fork and multiple subs

I am curious as to why it matters. The actual sub code takes almost no memory, and will use no additional memory if the subs are never called. If you have *a lot* of subs and are worried about fork performance, that's not an issue. A fork makes a copy of the data portion of the process, but uses the same code portion, so the number of subs won't affect fork performance.

fnord

Re^2: Fork and multiple subs
by ikegami (Patriarch) on Feb 16, 2011 at 22:38 UTC

    A fork makes a copy of the data portion of the process, but uses the same code portion, so the number of subs won't affect fork performance

    fork doesn't copy anything. Individual memory pages get copied the first time they are changed after a fork. But maybe you were talking abstractly and meant they wouldn't eventually get copied. I don't think that's true either.
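
    A minimal Perl sketch of the visible effect, assuming a Unix-like system where Perl's fork maps to the system fork. The child changes one array element and reports it through a pipe, while the parent's copy stays untouched. It cannot show which pages the kernel physically copied, only that the write stays private to the child. The array @data and its size are made up for illustration.

    ```perl
    use strict;
    use warnings;

    # Hypothetical data set; the size is arbitrary.
    my @data = (1) x 1000;

    pipe(my $reader, my $writer) or die "pipe failed: $!";

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: the write below touches memory shared with the parent,
        # so on a copy-on-write system the affected pages get copied now.
        close $reader;
        $data[0] = 99;
        print {$writer} "$data[0]\n";
        close $writer;
        exit 0;
    }

    # Parent: read what the child saw, then compare with our own copy.
    close $writer;
    my $child_value = <$reader>;
    chomp $child_value;
    close $reader;
    waitpid($pid, 0);

    my $parent_value = $data[0];
    print "child saw $child_value, parent still sees $parent_value\n";
    ```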

    Specifically, I'm pretty sure some fields of some ops change during execution (Update: I'm not so sure anymore), and I don't think there's any attempt to keep the ops in separate memory pages from variables that may change. Even if ops were sequestered into their own memory pages, loading a module or otherwise compiling code after the fork would change those pages.

    Update: Cleaned up phrasing. Added last sentence.

      fork doesn't copy anything
      Even though I have been working with Linux exclusively for the last 3 years, I still 'default' to Solaris in my *nix responses :(. Linux only initially copies the page tables (very small). Solaris, on the other hand, does copy all of the address space (unless you use vfork). In both environments, text (ie code) pages are read-only. I am 99% sure that in Solaris, text pages are only ever put into physical memory once (including the text portion of shared libs, once they have been loaded for the first time), and are then referenced by all processes using them. For Linux, it would not matter in this case, since pages are copy-on-write, and text pages can never be written to (hence, they would never be copied).

      fnord

        I think your argument overlooks the fact that Perl code isn't the same thing as compiled machine code.

        What the OS treats as text pages (code) is what is marked as such in the respective binary or shared library that's being loaded.  With respect to Perl, this applies only to the compiled machine code that makes up the interpreter itself, not the "byte code"-like instructions that Perl code is compiled into at runtime (i.e. after the Perl binary has been loaded/mapped).

        In other words, from the perspective of the OS, the compiled Perl opcodes are considered "data" (located on the heap) of the Perl executable. And you'd have to have rather good knowledge of the Perl internals to tell whether some bits in those data structures might possibly change (or not) as a result of running the code, which in turn would trigger copy-on-write...

        Linux only initially copies the page tables (very small). Solaris, on the other hand, does copy all of the address space (unless you use vfork).

        Thanks for the correction. I didn't know that some systems (e.g. Solaris) didn't use copy-on-write.

        In both environments, text (ie code) pages are read-only.

        And thus they can't possibly be used for Perl's ops. Perl wouldn't be able to populate such a page if it started out read-only. Additionally, the page can't become read-only later (if that's even possible) because Perl can compile more code at any time (as mentioned earlier).
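
        To illustrate the "compile more code at any time" point, here is a small sketch, assuming a Unix-like fork. The child compiles a new sub with a string eval after the fork; the parent never sees it. The name late_sub is hypothetical and exists only for this example.

        ```perl
        use strict;
        use warnings;

        my $pid = fork();
        die "fork failed: $!" unless defined $pid;

        if ($pid == 0) {
            # Child: compile a new sub at run time. Its ops are built in
            # ordinary process memory, not in an OS-level text page.
            eval 'sub late_sub { return 42 } 1' or die $@;
            my $code = main->can('late_sub');
            exit($code && $code->() == 42 ? 0 : 1);
        }

        # Parent: wait for the child and check our own symbol table.
        waitpid($pid, 0);
        my $child_compiled = ($? >> 8) == 0 ? 1 : 0;
        my $parent_has_sub = main->can('late_sub') ? 1 : 0;

        print "child compiled sub: $child_compiled\n";
        print "parent has sub: $parent_has_sub\n";
        ```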

Re^2: Fork and multiple subs
by msalerno (Beadle) on Feb 16, 2011 at 21:16 UTC
    I agree, it probably doesn't matter. However, most of the modules I am loading are being used only by the children. I know that with threads, you have a little more control over which modules get loaded in the parent.
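
    For what it's worth, a module can be kept out of the parent without threads by loading it with a run-time require after the fork. A rough sketch, assuming a Unix-like fork; Text::ParseWords is only a stand-in for whatever module the children actually need:

    ```perl
    use strict;
    use warnings;

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: require runs at run time, so the module is compiled
        # only in this process, after the fork.
        require Text::ParseWords;
        my @words = Text::ParseWords::shellwords(q{alpha 'beta gamma'});
        exit(@words == 2 ? 0 : 1);
    }

    # Parent: wait for the child, then check whether the module was
    # ever loaded in this process by looking at %INC.
    waitpid($pid, 0);
    my $child_ok      = ($? >> 8) == 0 ? 1 : 0;
    my $parent_loaded = exists $INC{'Text/ParseWords.pm'} ? 1 : 0;

    print "child ok: $child_ok\n";
    print "module loaded in parent: $parent_loaded\n";
    ```

    The trade-off is that each child pays the compile cost itself, whereas a use in the parent compiles once and lets the children share the result until pages are written to.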