in reply to Re: release threads resources?
in thread release threads resources?

I have also thought about and tried the detach method, since the return value of each thread is not important, but the memory issues persist with detach just as they did with join.

Re^3: release threads resources?
by BrowserUk (Patriarch) on Oct 12, 2005 at 02:56 UTC

    As stated above, when a thread terminates its memory is returned to the process, not to the OS. The memory becomes part of the free memory pool(s) from which new program elements, both code and data, will be allocated as needed.

    What that means is that if any of your existing threads allocate scalars or arrays, or instantiate new instances of classes and so on, then the memory from terminated threads is recycled to provide for them. If you start a new thread, that memory will be used to provide for the new thread.

    At the very extreme, if your threaded Perl process terminated threads and never needed to allocate another bean of memory, and other processes on your system continued to call for more and more memory until the only "free memory" in the system was that freed from the terminated threads within your process, then the OS would swap that free memory to disk, and the other processes would be allocated what they needed.

    That's called swapping (and everyone knows swapping is bad, right?), but the clever bit is that if your process never attempts to re-use that swapped-out free (virtual) memory, it will just stay on disk, and so the equivalent amount of real memory will be available to any and every other process in the system anyway. Swapping is only bad if it happens to memory that is in constant use and results in thrashing.

    So, what is the "memory issue" you have with Perl's threads?


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    The "good enough" maybe good enough for the now, and perfection maybe unobtainable, but that should not preclude us from striving for perfection, when time, circumstance or desire allow.
      The issue was that my application, before I rewrote it, was chewing up about 80% of a gig of DDR memory. It is an app that connects to over 500 servers, and for each server the app creates 1 thread and three socket connections (1 to the server and 2 to a MySQL database). Each thread also creates pretty extensive hash tables (sometimes well over a few thousand keys). I am aware that Perl's hashes are memory hungry, but the constant-time O(1) lookup is needed for speed of execution.

      I also run a front end for this app via a webpage that I host on my Apache web server on the same machine, so I was worried about the resources left over to handle any HTTP queries. In the end I rewrote the app to implement some load balancing: the app respawns itself after handling ((scalar(@servers)/5)+1) servers and then starts from where it left off. At first I tried system(), but this of course did not work, because system() blocks waiting for the child to return, so I used exec() to overwrite the current pid and dump the resources back to the OS. It seems to be working out nicely, and each iteration only uses about 20% of the memory now.
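
      For anyone wanting to try the same trick, here is a minimal sketch of the respawn-with-exec() idea. It is not the actual application: the server names, the batch_range() helper and the convention of passing the next start index as the first command-line argument are all assumptions made for illustration, and int() is added so the batch size is a whole number.

```perl
use strict;
use warnings;

# Hypothetical server list; the real application has 500+ entries.
my @servers = map { "server$_" } 1 .. 12;

# Return the first and last index of the batch that begins at $start.
# The batch size follows the formula from the post, rounded down by int().
sub batch_range {
    my ( $start, $total ) = @_;
    my $size = int( $total / 5 ) + 1;
    my $end  = $start + $size - 1;
    $end = $total - 1 if $end > $total - 1;
    return ( $start, $end );
}

# Assumed convention: the start index arrives as the first argument.
my $start = @ARGV ? int( $ARGV[0] ) : 0;
my ( $first, $last ) = batch_range( $start, scalar @servers );

# Stand-in for the real work (threads, sockets, database updates).
print "pid $$ handling $servers[$_]\n" for $first .. $last;

# Replace this process with a fresh copy that resumes at the next index.
# exec() keeps the pid but discards the old image, so its memory really
# is handed back to the OS. Only attempted when run from a script file.
my $next = $last + 1;
if ( $next < @servers && -f $0 ) {
    exec( $^X, $0, $next ) or die "exec failed: $!";
}
```

      Unlike system(), exec() never returns on success, so nothing is left waiting on a child and the old interpreter's memory goes with it.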

        An interesting application and it sounds like you have a working solution.

        I assume that you are connecting a subset of the 500 servers at any given time?

        I've generally found it better to have a few long-running threads rather than lots of short-running ones.

        Basically, each thread is a loop that (in your case) would connect to a server, do what it needs to, disconnect, then loop back and connect to the next server. To ensure each thread does as much of the work as it is capable of, you load/feed a shared queue with the server information, and each time around the loop, each thread picks off the next server to be connected. Any results can be fed back to the main thread via one or more return queues.

        The nice things about this arrangement are:

      • It scales nicely. Once you have it working (slowly) with a single comms thread, you just start as many more identical threads as your system and bandwidth can handle.
      • It minimises any memory leaks that might occur, as each thread persists until the work is done. With the start-a-new-thread-for-each-server approach, you multiply any leaks by the number of servers.

        The downside for your application is that any large data tables you need within your comms threads will be replicated per thread, but it would appear you are doing that anyway, and by only replicating for a few persistent threads instead of every time you start a new one, you will save time.
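
        A minimal sketch of the arrangement described above, using the threads and Thread::Queue modules. The server names, the pool size of 4 and the poll_server() placeholder are assumptions for illustration only; the real connect/work/disconnect code would go where the placeholder is.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical server list and pool size; tune the latter to what the
# system and bandwidth can handle.
my @servers   = map { "server$_.example.com" } 1 .. 20;
my $n_workers = 4;

my $work    = Thread::Queue->new;    # server names waiting to be handled
my $results = Thread::Queue->new;    # result strings fed back to main

# Placeholder for the real per-server work (connect, query, disconnect).
sub poll_server {
    my ($server) = @_;
    return "$server:ok";
}

# Each worker loops, picking off the next server until it dequeues undef.
sub worker {
    while ( defined( my $server = $work->dequeue ) ) {
        $results->enqueue( poll_server($server) );
    }
    return;
}

# Start a few long-running threads instead of one thread per server.
my @pool = map { threads->create( \&worker ) } 1 .. $n_workers;

# Feed the shared queue, then add one end-of-work marker per worker.
$work->enqueue(@servers);
$work->enqueue( (undef) x $n_workers );

$_->join for @pool;

# Drain the return queue without blocking now that all workers are done.
my @done;
while ( defined( my $r = $results->dequeue_nb ) ) {
    push @done, $r;
}
printf "%d of %d servers processed\n", scalar @done, scalar @servers;
```

        Scaling up is then just a matter of raising $n_workers; the worker code itself does not change.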
