The most likely reason is that you are never joining your threads. Just loading the front page at CNN.com started 31 threads. These threads then terminate, but are never cleaned up. Each thread is somewhat over 1 MB in size, so after loading cnn.com once, total memory use has grown to close to 50 MB. A couple of refreshes and this will force swapping and resource exhaustion.
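To illustrate what joining buys you, here is a minimal, self-contained sketch. It is not your proxy: worker() is a hypothetical stand-in for the real per-request handler, and it assumes a perl built with ithreads. It spawns a few threads, joins each one, and then checks that none are left outstanding.

```perl
use strict;
use warnings;
use threads;

# Hypothetical stand-in for the real per-request handler.
sub worker {
    my ($id) = @_;
    return $id * 2;
}

# Spawn a handful of threads, much as the proxy does once per request.
my @workers = map { threads->create( \&worker, $_ ) } 1 .. 5;

# join() waits for each thread to finish, collects its return value,
# and lets perl release the resources held by that thread.
my @results = map { $_->join } @workers;

print "joined ", scalar(@results), " threads: @results\n";

# threads->list returns threads that are neither joined nor detached,
# so after the joins above it should be empty.
my @remaining = threads->list;
print "threads still outstanding: ", scalar(@remaining), "\n";
```

In a long-running server you would not want to block on join() in the accept loop, which is why detaching is usually the simpler option there.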

You're also passing $req and $host, lexical scalars, to each thread. The scalars are being shared automatically for you, and since you only read them, the lack of locking is probably OK. However, each time you share a variable, it is shared with every thread. That means that every thread created (including those that are dormant but unjoined) gets a copy of every request object added to its memory space.

As a first pass at fixing this, you should detach your threads once you've spawned them so that they die a natural death, and undef $req and $host before the thread terminates.

    ...
    } else {
        threads->create( \&process_one_req, $browser, $req, $host )->detach;
    }
    ...

    sub process_one_req {
        my ($browser, $req, $host) = @_;
        my $remote = new IO::Socket::INET(
            Proto    => "tcp",
            PeerAddr => $host,
            PeerPort => 80
        );
        if ($remote) {
            print $remote $req;
            my $chunk;
            print $browser $chunk while (sysread($remote, $chunk, 10000));
            close($remote);
            undef($remote);
        } else {
            print $browser RES_400;
        }
        close($browser);
        undef($req);
        undef($host);
        undef($browser);
    }

Making these changes, I can load and reload the CNN front page, and whilst the memory use grows to around 8 MB at the peak, it rapidly falls back to around 5 MB as the requests complete and the threads die. This seems to cure the continuous memory growth completely and may effect a cure for your transient core dumps.

I also noticed that if the request is a POST rather than a GET, then your regex to extract the page name fails and results in:

    Use of uninitialized value in concatenation (.) or string at P:\test\proxy.pl8 line 42.
    Received request for [perlmonks.com, ]
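I haven't reproduced your regex here, so the following is only a sketch of one way to make the request-line parsing method-agnostic. parse_request_line() is a hypothetical helper, not something from your script, and it assumes the usual "METHOD target HTTP/x.y" form of the first line. The point is to accept any method and to check that the match succeeded before using the captures, which avoids the uninitialized-value warning.

```perl
use strict;
use warnings;

# Hypothetical helper: split an HTTP request line into method and target.
# Returns an empty list if the line does not look like a request line.
sub parse_request_line {
    my ($line) = @_;
    return unless defined $line;

    # METHOD, whitespace, request target, whitespace, HTTP version,
    # then optional trailing whitespace such as CRLF.
    if ( $line =~ m{\A([A-Z]+)\s+(\S+)\s+HTTP/\d+\.\d+\s*\z} ) {
        return ( $1, $2 );
    }
    return;
}

for my $line (
    "GET /index.html HTTP/1.0\r\n",
    "POST /cgi/form.pl HTTP/1.1\r\n",
    "garbage",
  )
{
    my ( $method, $target ) = parse_request_line($line);
    if ( defined $target ) {
        print "Received $method request for [$target]\n";
    }
    else {
        print "Could not parse request line\n";
    }
}
```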

HTH.


Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
Hooray!


In reply to Re: another core dump by BrowserUk
in thread another core dump by pg
