in reply to Re^12: Strange memory leak using just threads (forks.pm)
in thread Strange memory leak using just threads

an up-front single request for as many pages of VM as are required followed by a single ring 3 rep mov.

I don't think so. Each variable has to be copied separately, and the references have to be fixed up. So it's much more than a single memory allocation.

Yes. That is an annoying detail of the ithreads implementation. But, it is quite easy to avoid; you just spawn your workers early, and have them require rather than use what they (individually) need.

Sorry, don't see how this can help. Workers usually all need the same set of modules.

Re^14: Strange memory leak using just threads (forks.pm)
by BrowserUk (Patriarch) on Sep 22, 2010 at 16:24 UTC
    Sorry, don't see how this can help. Workers usually all need the same set of modules.

    The point of spawning early is to avoid there being much already in memory to cause duplication. Obviously, there's no point in require instead of use if all your threads need everything. (Hence (individually).)

    But, in many scenarios, whilst the threads require the same set of modules as each other, they don't need everything the main thread needs.

    Hence, for example, in a Tk app that uses some background threads for long-running calculations or fetching stuff from the web etc., it makes sense to use the modules needed by the workers, spawn the workers, and only then require Tk and anything else needed by the main thread. That way, the huge Tk doesn't get needlessly replicated into all the workers.
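
A minimal sketch of that ordering, under some loud assumptions: Math::BigInt is only a stand-in for Tk so the example runs on a stock perl, List::Util stands in for "whatever the workers need", and the queue names and worker count are made up for illustration. Each worker reports whether the main-thread-only module ever showed up in its own %INC.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;
use List::Util qw(sum);       # needed by the workers, so loaded up front

my $jobs    = Thread::Queue->new;   # hypothetical job queue
my $results = Thread::Queue->new;   # hypothetical result queue

# Spawn while only the workers' modules are loaded.
my @workers = map {
    threads->create(sub {
        while (defined(my $job = $jobs->dequeue)) {
            # Report the sum and whether the GUI stand-in reached this thread.
            my $has_gui = exists $INC{'Math/BigInt.pm'} ? 1 : 0;
            $results->enqueue(sum(@$job) . ":$has_gui");
        }
    });
} 1 .. 2;

# Main-thread-only module, loaded at run time after the spawn.
# Math::BigInt is an assumption standing in for Tk.
require Math::BigInt;

$jobs->enqueue([1, 2, 3], [4, 5, 6]);
$jobs->enqueue(undef, undef);        # one undef per worker ends its loop
$_->join for @workers;

my @out = sort map { $results->dequeue } 1 .. 2;
print "@out\n";
```

The trailing flag in each result stays 0 because the require happens after the workers were cloned, so the module exists only in the main interpreter.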

    Similarly, if a threaded app needs to use DBI, it makes sense to spawn a DBI thread that requires DBI internally, and serialise DBI requests through it. That avoids duplicating DBI in all the app's other threads, and avoids complications with DBs or DB libraries that use PIDs (rather than TIDs) for managing their internal memory.
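
The shape of such a service thread might look roughly like this. It is a sketch only: Digest::MD5 is an assumption standing in for DBI so that it runs without a database, the ask() helper and queue names are hypothetical, and a single shared reply queue is only safe with one client thread (several clients would each want their own reply queue).

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $requests = Thread::Queue->new;   # hypothetical request queue
my $replies  = Thread::Queue->new;   # hypothetical reply queue

# The service thread loads its heavy module itself, at run time.
# Digest::MD5 stands in for DBI here so the sketch runs on a stock perl.
my $service = threads->create(sub {
    require Digest::MD5;
    while (defined(my $req = $requests->dequeue)) {
        $replies->enqueue(Digest::MD5::md5_hex($req));
    }
});

# Other threads never load the module; they only exchange plain strings.
sub ask {
    my ($text) = @_;
    $requests->enqueue($text);
    return $replies->dequeue;        # blocks until the service answers
}

my $answer = ask('hello');
my $loaded_in_main = exists $INC{'Digest/MD5.pm'} ? 1 : 0;

$requests->enqueue(undef);           # tell the service thread to finish
$service->join;
print "reply: $answer; loaded in main thread: $loaded_in_main\n";
```

With real DBI the connect call would also live inside the service thread, so the handle never crosses a thread boundary.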

    Another example is a threaded app that processes a large volume of work items read in from a file. Spawn the worker threads before reading the file; otherwise the data structure holding the file contents gets replicated into all the threads even though they don't use it directly.
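
As a rough illustration of spawning before loading, assuming the items are handed out through a Thread::Queue (the worker count, queue names, and the generated list standing in for the file contents are all hypothetical):

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $WORKERS = 4;                     # hypothetical worker count
my $queue   = Thread::Queue->new;    # work items go out through here
my $results = Thread::Queue->new;    # one result per processed item

# Spawn the workers first, while the main thread holds almost no data,
# so there is little for each new interpreter to clone.
my @workers = map {
    threads->create(sub {
        while (defined(my $item = $queue->dequeue)) {
            $results->enqueue(length $item);   # stand-in for real work
        }
    });
} 1 .. $WORKERS;

# Only now build the large structure; it exists in the main thread alone.
my @items = map { "item $_" } 1 .. 1000;       # stand-in for file contents
$queue->enqueue(@items);
$queue->enqueue((undef) x $WORKERS);           # one undef per worker ends its loop
$_->join for @workers;

my $done = $results->pending;
print "processed $done items\n";
```

Moving the line that builds @items above the threads->create calls would give every worker its own private copy of the whole list, which is exactly the replication being avoided.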


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      The point of spawning early is to avoid there being much already in memory to cause duplication. Obviously, there's no point in require instead of use if all your threads need everything. (Hence (individually).)

      So I can use this optimisation if I need to run a single specific thread, but if I need to start a bunch of identical worker threads it won't work.

        No. It is N times as effective for N identical threads as it is for 1.