in reply to Parallel::ForkManager (high cpu and a lot of memory)

Which version of threads are you using? Upgrading to 0.71 seems to avoid memory leaks that the combination of Perl 5.10 and some earlier versions (e.g. 0.67) exhibited.

There is no way that you should be using 100% cpu with 10 threads performing IO. This seems to be a problem with Parallel::ForkManager on 5.10. You can do pretty much exactly the same thing as above, but using threads, like this:

    #! perl -slw
    use threads;
    use threads::shared;
    use LWP::UserAgent;
    use HTTP::Request;

    my $semStdout :shared;
    my $running :shared = 0;

    open(LIST,"urls.txt");
    while ( my $tld = <LIST> ) {
        chomp $tld;
        Win32::Sleep( 100 ) while do{ lock $running; $running >= 10 };
        async{
            { lock $running; ++$running; }
            my $url = "http://$tld/";
            my $ua = new LWP::UserAgent;
            $ua->timeout(5);
            $ua->agent("Mozilla/6.0");
            my $req = HTTP::Request->new('GET',$url);
            my $res = $ua->request($req);
            my $content = $res->content;
            my $status = $content =~ /OK/i ? 'ack' : 'nak';
            { lock $semStdout; printf "(%3d)$tld: %s\n", threads->self->tid, $status; }
            { lock $running; --$running; }
        }->detach;
    }
    close(LIST);

Memory usage seems to be stable and cpu usage < 10% for 10 threads.

There are better, less resource-intensive ways of using threads, but this approach does have the virtue of being very close to the P::FM way of operating, which you might consider a bonus.
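One such less resource-intensive arrangement, as a minimal sketch only: start a fixed pool of worker threads once and feed them from a Thread::Queue, rather than spawning a new thread per URL. The name run_pool and the callback interface are hypothetical, made up for this illustration; the stub callback at the bottom stands in for the LWP::UserAgent request so that the sketch runs without network access.

```perl
#! perl -w
use strict;
use threads;
use Thread::Queue;

# Hypothetical helper: run $work->( $item ) for every item in @$items
# using a fixed pool of $n worker threads.
# Returns a hash ref mapping each item to what $work returned for it.
sub run_pool {
    my( $n, $items, $work ) = @_;
    my $Qin  = Thread::Queue->new;
    my $Qout = Thread::Queue->new;

    # Start the workers once; each loops until it dequeues undef.
    my @workers = map {
        threads->create( sub {
            while( defined( my $item = $Qin->dequeue ) ) {
                $Qout->enqueue( join "\t", $item, $work->( $item ) );
            }
        } );
    } 1 .. $n;

    $Qin->enqueue( @$items );
    $Qin->enqueue( ( undef ) x $n );    # one terminator per worker
    $_->join for @workers;

    # All workers have finished, so a non-blocking drain is sufficient.
    my %results;
    while( defined( my $line = $Qout->dequeue_nb ) ) {
        my( $item, $status ) = split /\t/, $line, 2;
        $results{ $item } = $status;
    }
    return \%results;
}

# Usage with a stub check. In the real script the callback would build
# the LWP::UserAgent request and return 'ack' or 'nak' as in the code above.
my $res = run_pool( 10, [ qw( example.com example.org ) ], sub {
    my $tld = shift;
    return $tld =~ /com$/ ? 'ack' : 'nak';    # placeholder logic only
} );
print "$_: $res->{ $_ }\n" for sort keys %$res;
```

With a pool, the per-thread start-up cost (cloning the interpreter and its loaded modules) is paid once per worker rather than once per URL, and the number of concurrent requests is capped by the pool size without a polling loop.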


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
"Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."

Re^2: Parallel::ForkManager (high cpu and a lot of memory)
by salva (Canon) on Oct 08, 2008 at 12:03 UTC
    There is no way that you should be using 100% cpu with 10 threads performing IO
    But it's not just IO: LWP::UserAgent loads and parses several Perl modules on demand.

    For instance, running the OP script (with a fixed set of URLs) under strace on my machine shows that every process reads...

    bytes Compress::Raw::Zlib Compress::Zlib Fcntl File::Glob File::GlobMapper File::Spec File::Spec::Unix HTML::Entities HTML::HeadParser HTML::Parser IO IO::Compress::Adapter::Deflate IO::Compress::Base IO::Compress::Base::Common IO::Compress::Gzip IO::Compress::Gzip::Constants IO::Compress::RawDeflate IO::Compress::Zlib::Extra IO::File IO::Handle IO::Seekable IO::Select IO::Socket IO::Socket::INET IO::Socket::UNIX IO::Uncompress::Adapter::Inflate IO::Uncompress::Base IO::Uncompress::Gunzip IO::Uncompress::RawInflate List::Util LWP::Protocol::http Net::HTTP Net::HTTP::Methods Scalar::Util SelectSaver Socket Symbol URI::_generic URI::http URI::_query URI::_server utf8

      Given that forks are threads on win32, and I see those same file accesses on my machine with my threaded code, there has to be something else going on that is consuming cpu, because the threaded version uses far less.


        Thx for all the help so far!

        But I still have a question.
        If I coded the same thing in C++, for example, would the process use less memory and cpu?