PerlMonks  

Multithreaded memory usage

by bingohighway (Acolyte)
on Apr 21, 2009 at 07:54 UTC

bingohighway has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks,

I am currently writing a program that pulls some info off a web page, processes it, then whacks it into a database. It has to be fairly punctual and run every 5 seconds. As this process can take a few seconds, sometimes more than 5, I have decided to run each 5-second read in a new thread.

The problem is that once a thread has finished, its memory isn't released, so memory usage gradually increases as more threads are created. I have tried detaching the thread and ending it with return(), no joy. exit() just kills the parent process. I have also tried waiting for, say, 20 seconds and killing the thread externally, e.g. $thread->exit().

I am currently using the threads module.

Any ideas?

Cheers!

Replies are listed 'Best First'.
Re: Multithreaded memory usage
by moritz (Cardinal) on Apr 21, 2009 at 08:01 UTC
    Any ideas?

    Two ideas actually. The first is to search for memory leaks with something like Devel::Cycle or Devel::Leak - maybe you just have a circular reference somewhere that prevents perl from destroying the objects.

    The second idea is to spawn a new process [corrected from "thread"] for each task, and let the operating system clean up the memory for you.
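
    A circular reference of the kind mentioned in the first idea can be sketched with core modules only. This is a minimal illustration, not what Devel::Cycle or Devel::Leak do internally; the helper names is_freed, build_cycle and build_weak are made up for the example. It uses a weakened probe reference to check whether a structure is actually destroyed once the last outside reference is dropped.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

# Returns true if the structure built by $builder is freed once the
# last strong external reference to it goes away.
sub is_freed {
    my ($builder) = @_;
    my $obj   = $builder->();
    my $probe = $obj;      # second reference to the same structure
    weaken($probe);        # the probe no longer keeps it alive
    undef $obj;            # drop the only strong external reference
    return !defined $probe;
}

# A parent and child that point at each other: a reference cycle.
sub build_cycle {
    my $parent = { name => 'parent' };
    my $child  = { name => 'child', parent => $parent };
    $parent->{child} = $child;
    return $parent;
}

# Same structure, but the back-reference is weakened, so no cycle.
sub build_weak {
    my $parent = { name => 'parent' };
    my $child  = { name => 'child', parent => $parent };
    weaken( $child->{parent} );
    $parent->{child} = $child;
    return $parent;
}

printf "cycle freed: %s\n", is_freed( \&build_cycle ) ? 'yes' : 'no';
printf "weak  freed: %s\n", is_freed( \&build_weak )  ? 'yes' : 'no';
```

    If objects created inside each thread form a cycle like build_cycle, perl's reference counting never reclaims them, which would look like steadily growing memory.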

      Ok, I'll try that. I am assuming once a detached thread reaches the end of its code block it should automatically quit and the OS should recover that memory? (assuming it is coded correctly :-) )

      Cheers

        I am assuming once a detached thread reaches the end of its code block it should automatically quit and the OS should recover that memory?

        I'm really not an expert here, but I don't think that's the case, at least not always. The OS probably doesn't have any idea about which parts of the memory are associated with which user-level thread, and thus can't clean up. It does know about processes though, which is why I recommended them over threads.

      The second idea is to spawn a new thread for each task

      Did you mean process rather than thread?

      update: p.s. hurrah for physicists (especially those forgetting everything they learnt by doing a completely unrelated job) :)

      ........
      Those are my principles. If you don't like them I have others.
      -- Groucho Marx
      .......
        yes, that's what I meant. Sorry for the confusion.
      Okay,

      I have tried the fork method and after about 60 forks (over 5 mins) the parent process can't fork any more processes. I have tried setting $SIG{CHLD} = 'IGNORE' to ignore the zombified children, but to no avail.

      Any ideas where to go to next?

      Cheers!

        On Windows, start the subprocess not via fork() but via system(1, @args) or via system("start @args"). That way, it is dissociated from the main Perl program.

        wait (or waitpid) for them (perhaps in $SIG{CHLD} to be nearly non-blocking) so that they don't become zombies? And what's the error message?
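
        The wait/waitpid suggestion can be sketched as below. This assumes a POSIX system with a real fork(); on Windows, fork is emulated with threads, so it may behave differently there. The names spawn_task, reap_finished and %children are invented for the example, and the sleeping task stands in for the real fetch-and-store work.

```perl
use strict;
use warnings;
use POSIX qw(WNOHANG);

my %children;   # pid => 1 for every child not yet reaped

# Fork a child that runs $task and exits; the parent records the pid.
sub spawn_task {
    my ($task) = @_;
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ( $pid == 0 ) {       # child
        $task->();
        exit 0;
    }
    $children{$pid} = 1;     # parent
    return $pid;
}

# Collect every child that has already exited, without blocking.
sub reap_finished {
    my $reaped = 0;
    while ( ( my $pid = waitpid( -1, WNOHANG ) ) > 0 ) {
        delete $children{$pid};
        $reaped++;
    }
    return $reaped;
}

# Demo: start three short tasks, then poll until all are reaped.
spawn_task( sub { sleep 1 } ) for 1 .. 3;
while (%children) {
    reap_finished();
    select( undef, undef, undef, 0.1 );   # short pause between polls
}
print "all children reaped\n";
```

        In the 5-second loop, calling reap_finished() once per cycle (or from a $SIG{CHLD} handler) should keep finished children from piling up as zombies.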
Re: Multithreaded memory usage
by BrowserUk (Patriarch) on Apr 21, 2009 at 15:48 UTC

    Forget using Perl's Windows fork emulation. It leaks like a sieve.

    See how you get on with this. On my system it settles down to a steady state memory usage after a few cycles:

    #! perl -sw
    use 5.010;
    use strict;
    use threads ( stack_size => 4096 );

    while( 1 ) {
        async( \&get_data_and_go )->detach;
        sleep 5; ## Go get every 5 seconds
    }

    sub get_data_and_go {
        my $tid = threads->tid;
        require LWP::Simple; ## Requiring prevent CLONE leaks.

        ##does some webpage stuff and exits
        my $bytes= length( LWP::Simple::get( 'http://www.yahoo.com' ) );
        say "$tid : Got $bytes";
        sleep rand 10; ## Simulated variable processing time
        say "$tid : done";
        return 0;
    };

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Superb :-) Did the trick. No memory leaks and seems to be less CPU intensive.

      Cheers!

Re: Multithreaded memory usage
by ikegami (Patriarch) on Apr 21, 2009 at 14:03 UTC
    Try explicitly undefining the variables that hold the thread reference. You might have a circular reference caused by a function that captured the thread variable.
