So, I'm modifying a Perl script I wrote to make it multi-threaded. It takes about 30 minutes to run using 1-2% of the CPU on our machines, and with 10 threads doing the work at the same time it uses 10-20% of the CPU... but it runs out of memory.

Originally, the script read 20000+ files, parsed them for certain bits of information, and put that into a hash of arrays of hashes of arrays of hashes so that XML::Simple could output the relevant information into a neat 7.5MB XML file (later loaded into another script).
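To make that nesting concrete, here is a minimal sketch of the shape of the structure, without any threading. The file name and field names are placeholders made up for illustration, and Data::Dumper stands in for XML::Simple just to print the result.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical sample data: the real keys and values are not shown here.
my $data = {
    arraykey => [                       # hash -> array
        {                               # -> hash (one per file)
            'file1.dat' => [            # -> array (one entry per parsed record)
                { field => 'value' },   # -> hash (the parsed bits of information)
            ],
            other => 'some other value',
        },
    ],
};

$Data::Dumper::Sortkeys = 1;            # stable key order in the dump
print Dumper($data);
```

Every level below the top is reached through a reference, which matters once the structure has to be shared between threads.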

Now, I am basically trying to make a thread for the parsing of each file, with a limit of 10 threads. I am running Perl 5.8.2, using threads and threads::shared. Here is a basic example of what I'm doing:

    use threads;
    use threads::shared;

    $hashone  = &share({});
    $arrayone = &share([]);
    $hashone->{"arraykey"} = $arrayone;

    foreach $filename (@filenames) {    # @filenames: the 20000+ input files
        # some other stuff
        $hashtwo = &share({});
        push(@$arrayone, $hashtwo);
        $arraytwo = &share([]);
        $hashtwo->{$filename} = $arraytwo;
        # add some other values to $hashtwo

        # keep at most 10 threads running: join the oldest before starting another
        if ($#threads >= 9) {
            $thread = shift(@threads);
            $thread->join();
            undef $thread;
        }
        push(@threads, threads->create(\&threadfunction, $filename, $arraytwo));
    }
    # clean up all the other threads, finish as I would without threading

    sub threadfunction {
        my ($filename, $arraytwo) = @_;
        open(PIPE, "outputprogram $filename |");
        while (<PIPE>) {
            # add various $hashthree = &share({}) to $arraytwo
            # add various values to the $hashthree's
        }
        close PIPE;
    }

I added the undef $thread; because of another post I found, where the poster said his threads weren't giving up their memory otherwise, which didn't make any sense to him (nor to me). It did allow the program to run longer before running out of memory.

I tried sharing the pointers to the hashes/arrays outside of the thread. That did not help.

My only ideas were:
1. The array/hash/array etc. is taking up too much memory, but it wasn't before I added threads, so this makes no sense.
2. The pipes out of those files are taking up too much memory now that there are 10 of them. But each one is only 10kB of information, so that's impossible.
3. Somehow, maybe due to the references not being shared, it was cloning the hash/array/hash etc. for each thread, which would get bigger in the first place and therefore bigger for each new thread as time went on. Except, if it were cloning them, then the original would not be getting much bigger at all. All the pointers are passed in by value anyway, and I have to assume all the values in a shared hash/array are shared (in fact, by the rules, I don't think I could add a non-shared anything to a shared hash/array). The only shared values I create in a thread are the arrays/hashes, and I point to those only in the main shared array/hashes in the parent thread.
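My assumption in idea 3 about non-shared values can be checked with a small sketch like the one below. It assumes a perl built with thread support, and the key names are made up for illustration. As far as I can tell, threads::shared refuses to store a plain reference in a shared hash and dies with an "Invalid value for shared scalar" error, while plain scalars are simply copied in.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $shared_hash = &share({});

# Storing a plain (non-shared) array reference in a shared hash should die,
# so the assignment is wrapped in eval to catch the error.
my $ok = eval { $shared_hash->{plain} = [1, 2, 3]; 1 };
print "non-shared ref rejected: $@" unless $ok;

# A shared array reference is accepted.
my $shared_array = &share([]);
$shared_hash->{shared} = $shared_array;

# Plain scalars can be pushed onto the shared array directly.
push @$shared_array, "value";
print "stored ", scalar(@{ $shared_hash->{shared} }), " element(s)\n";
```

So every nested array or hash has to be shared explicitly before it is attached, which is what my example does at each level.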

I don't know if this problem is solvable, or if it's something inherently wrong with Perl threading and what I'm trying to do with it, but it would be nice to at least know why it's happening. Any ideas? Should I just leave it as a 30 minute process and code it in Java instead if I actually want it to work?


In reply to Threads in Perl: Just leaky? by TheShrike
