Hi All:

I ran into some unusual Perl behavior today at work and was wondering if anyone could explain why it occurs.

I was attempting to load a queue with a moderate number of strings (200,000) that I needed to process. I'm a big fan of using map in void context, so I loaded the queue inside two nested map statements. This appeared to load the queue successfully and do the work I intended, but it used a very large amount of memory: approximately 10,000MB with 1 thread. I then rewrote the code to load the queue inside two nested foreach loops, and only 800MB was used.

Can anyone explain this behavior? I know it has to do with how the queue is loaded and not with the work being done on the queued strings, because the following code snippets show the same behavior on my machine. The input is a flat file composed of .mol files.

Nested Maps (uses 10,000MB)

    use MolFile;
    use threads;
    use Thread::Queue;

    # Parse the input file into a hash of compounds keyed by name
    my $database_compounds = ( MolFile->new( "File" => shift @ARGV )->parse_noHydrogens() );
    my @names = ( keys %$database_compounds );

    # Stand-in worker: returns immediately without touching either queue
    sub doWork { return; }

    ( our $THREADS, my $Qwork, my $Qresults ) = ( 1, new Thread::Queue, new Thread::Queue );
    my @thread_pool = map { threads->create( \&doWork, $Qwork, $Qresults ) } 1..$THREADS;

    #----
    # Enqueue every pair of names whose compounds share a formula
    map {
        my $i = $_;
        map {
            $Qwork->enqueue("$names[$_]!"."$names[$i]!")
                if ( $$database_compounds{$names[$i]}->Formula eq $$database_compounds{$names[$_]}->Formula );
        } ($i+1..$#names);
    } (0..$#names);
    #----

    # One undef per thread marks the end of the work queue
    $Qwork->enqueue( (undef) x $THREADS );
    map {$_->join();} @thread_pool;

    for (1..$THREADS) {
        while ( my $result = $Qresults->dequeue ) {
            print $result , "\n";
        }
    }

Nested Foreach (uses 800MB)

    use MolFile;
    use threads;
    use Thread::Queue;

    # Parse the input file into a hash of compounds keyed by name
    my $database_compounds = ( MolFile->new( "File" => shift @ARGV )->parse_noHydrogens() );
    my @names = ( keys %$database_compounds );

    # Stand-in worker: returns immediately without touching either queue
    sub doWork { return; }

    ( our $THREADS, my $Qwork, my $Qresults ) = ( 1, new Thread::Queue, new Thread::Queue );
    my @thread_pool = map { threads->create( \&doWork, $Qwork, $Qresults ) } 1..$THREADS;

    #----
    # Enqueue every pair of names whose compounds share a formula
    foreach my $i (0..$#names) {
        foreach my $j ($i+1..$#names) {
            if ( $$database_compounds{$names[$i]}->Formula eq $$database_compounds{$names[$j]}->Formula ) {
                my $string = "$names[$i]"."!"."$names[$j]";
                $Qwork->enqueue($string);
            }
        }
    }
    #----

    # One undef per thread marks the end of the work queue
    $Qwork->enqueue( (undef) x $THREADS );
    map {$_->join();} @thread_pool;

    for (1..$THREADS) {
        while ( my $result = $Qresults->dequeue ) {
            print $result , "\n";
        }
    }

These are identical to the program I was working with, except that in the real program the doWork subroutine is replaced with one that actually does work. I don't mind using the foreach version, but I would like to know why the nested maps produce this behavior.

Any ideas would be appreciated.

Thanks


In reply to Thread::Queue memory issue with nested maps but not foreach loops... by jmmitc06
