Now it seems the worst cause of bloat are the modules copied for each thread. I have minimized the number of use'd modules, but they still amount to about 5 MB per thread. With 200 threads this uses half the available 32-bit address space...

5MB per kernel thread is hardly huge for an interpreted language.

You made it clear in your OP that you don't want to hear this which is why I haven't responded directly earlier, but your memory problems are a direct result of the bad architecture of your code.

Simply stated, there is no point in running 200 concurrent threads. You mention LWP, so I infer from that your program is doing some sort of downloading. If you have a 100mb/s connection, by running 200 concurrent threads, you've effectively given each thread the equivalent of a dial-up connection. So, in addition to chewing up more memory, throwing threads at your download task simply means that each thread takes longer and longer to complete its individual download.

In addition to your listed alternative possibilities of LWP::Parallel and HTTP::Async, there is another. That of using a sensible number of threads in a thread pool arrangement. Which for this type of IO-bound processing usually works out to be somewhere in the order of 2x to 5x times the number of cores your box has. For today's typical hardware that means 8 to 20 will give you the best throughput.

Using your own numbers, that will need < 100MB of memory, leaving at least 1.9GB available for your data.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

In reply to Re^2: Sharing large data structures between threads by BrowserUk
in thread Sharing large data structures between threads by Anonymous Monk

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.