on a qrsh ~18G it takes about 20 minutes. When I can't get a node with that kind of memory it's a few hours.

Figures. When you fail to get a node with sufficient memory, the process starts swapping, and that will always kill performance big time. You need to avoid that at all costs.

When I run your code here, the fan out varies widely depending upon the random patterns:

c:\test>912999
I:1 L:1 I:2 L:14 I:3 L:34 I:4 L:182 I:5 L:977 I:6 L:6578 I:7 L:58659
Terminating on signal SIGINT(2)

c:\test>912999
I:1 L:1 I:2 L:13 I:3 L:142 I:4 L:784 I:5 L:6166 I:6 L:49483 I:7 L:299369
Terminating on signal SIGINT(2)

c:\test>912999
I:1 L:1 I:2 L:1 I:3 L:2 I:4 L:13 I:5 L:95 I:6 L:138 I:7 L:624 I:8 L:2923
Terminating on signal SIGINT(2)

c:\test>912999
I:1 L:1 I:2 L:2 I:3 L:5 I:4 L:29 I:5 L:53 I:6 L:294 I:7 L:1935
Terminating on signal SIGINT(2)

c:\test>912999
I:1 L:1 I:2 L:2 I:3 L:14 I:4 L:57 I:5 L:157 I:6 L:1467 I:7 L:3200 I:8 L:23871 I:9 L:81714
Terminating on signal SIGINT(2)
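To put a number on "varies widely", here is a minimal sketch that computes the growth factor between successive iterations. It assumes L is the count that grows with each iteration I; growth_factors is a name made up for this illustration, and the two lists are copied from the first and third runs above.

```perl
use strict;
use warnings;

# Growth factor between successive iterations: L[n] / L[n-1].
# (growth_factors is a hypothetical helper, not part of the posted code.)
sub growth_factors {
    my @sizes = @_;
    return map { $sizes[$_] / $sizes[ $_ - 1 ] } 1 .. $#sizes;
}

# L values copied from the first and third runs shown above.
my @run1 = ( 1, 14, 34, 182, 977, 6578, 58659 );
my @run3 = ( 1, 1, 2, 13, 95, 138, 624, 2923 );

# Print one line of factors per run, rounded to one decimal place.
for my $run ( \@run1, \@run3 ) {
    print join( ' ', map { sprintf '%.1f', $_ } growth_factors(@$run) ), "\n";
}
```

Comparing the two lines of factors shows how differently two runs of the same code can fan out, which is why a fixed iteration count is a poor proxy for memory use.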

I'm going to assume that qrsh and qsub are GRID APIs?

Whilst there is much that can be done to improve the performance of your posted code, given these statistics, it seems likely that the main constraint for your program is memory usage. Once your program moves into swapping, any titivations done to save a few microseconds here and there will just get drowned in the noise of disk (memory) thrashing.

My suggestion would be to modify your script to monitor the size of the %allgen hash and, when it reaches a size that is likely to push the minimum-size node on your GRID into swapping, split the contents of that hash into (say) four files and qsub four nodes to read those files and pick up the algorithm from that point.

So, (say) you run 20 iterations and generate 1 million mutations. You split those 1 million into 4 files and start four nodes to pick up from that point with a quarter of a million candidates each. When each of those nodes approaches 1 million mutations, you repeat the split. And so on.
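A rough sketch of the split step, under loud assumptions: I don't know the real structure of %allgen, so this treats it as a flat hash of plain scalar values whose keys contain no tabs or newlines. The names needs_split, split_to_files and load_part, the file prefix and the threshold are all made up for illustration; they are not from your script.

```perl
use strict;
use warnings;

# Hypothetical settings: tune them to the smallest node you expect to get.
my $MAX_KEYS = 1_000_000;
my $PARTS    = 4;

# True once the hash has grown to the point where it should be split.
sub needs_split {
    my ( $allgen, $max_keys ) = @_;
    return scalar( keys %$allgen ) >= $max_keys;
}

# Deal the entries of %$allgen round-robin into $parts files named
# "$prefix.N", one "key<TAB>value" record per line. Returns the file names.
sub split_to_files {
    my ( $allgen, $parts, $prefix ) = @_;
    my ( @names, @fhs );
    for my $n ( 0 .. $parts - 1 ) {
        my $name = "$prefix.$n";
        open my $fh, '>', $name or die "Cannot write $name: $!";
        push @names, $name;
        push @fhs,   $fh;
    }
    my $i = 0;
    for my $key ( keys %$allgen ) {
        print { $fhs[ $i++ % $parts ] } "$key\t$allgen->{$key}\n";
    }
    close $_ or die "close failed: $!" for @fhs;
    return @names;
}

# Read one part back into a hash so a fresh process can carry on from there.
sub load_part {
    my ($name) = @_;
    my %part;
    open my $fh, '<', $name or die "Cannot read $name: $!";
    while ( my $line = <$fh> ) {
        chomp $line;
        my ( $key, $value ) = split /\t/, $line, 2;
        $part{$key} = $value;
    }
    close $fh;
    return \%part;
}

# Inside the iteration loop you might then do something like:
#   if ( needs_split( \%allgen, $MAX_KEYS ) ) {
#       my @files = split_to_files( \%allgen, $PARTS, 'allgen.part' );
#       # submit one job per file here; the qsub arguments are site-specific
#       last;
#   }
```

Each submitted job would call load_part on its own file and resume the loop from there, repeating the same check. Counting keys is only a proxy for memory use, so the threshold still has to be calibrated against what a node can actually hold.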

You'll need to judge the split points in the light of your knowledge of the systems available to you. On my (currently only 2GB) system, I've never managed to run your code past 10 iterations before the process moved into swapping.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

In reply to Re^7: Increasing the efficiency of a viral clonal expansion model by BrowserUk
in thread Increasing the efficiency of a viral clonal expansion model by ZWcarp
