BioNrd has asked for the wisdom of the Perl Monks concerning the following question:

I am using a nested loop b/c I have external number of things that need to happen, then an internal number of things that corresponds directly to the external, then a super internal set of things that needs to correspond directly to the internal things.

i.e. MotherA -> ChildA -> genes within childA; MotherA -> ChildB -> genes within childB; MotherB -> ChildB -> genes within childB

I have a list of mothers and fathers that I am randomly combining by saying: 30 mothers (external) have "n" children (internal) from "n" fathers. The children then get n genes from father and n genes from mother (super internal).

I have a code that works exactly as I would like it to. However, the problem is when the number of mothers, and/or number of offspring per mother (think plants), and/or the number of genes I am tracking gets exceedingly large the run time of the script gets horrendously long. Here is a mock up of the code:

use strict; use warnings; my $num_of_external = 30; #num moms my @num_of_internal = (30, 40, 20, 30, 40, 30, 40, 20, 30, 40, 30, 40, 20, 30, 40, 30, 40, 20, 30, 40, 30, 40, 20, 30, 40, 30, 40, 20, 30, 40); #num offspring per mom my $num_of_super_internal = 10; #num genes tracked #during shuffle 5 come from dad 5 come from mom for (1 .. $num_of_external) { my $internal = shift @num_of_internal; #additional things here. #i.e. select mother from possible pool for (1 .. $internal) { #select father for each offspring #additional things here. for (1 .. $num_of_super_internal) { #additional things here. #select genes and shuffle around between mom and dad. } } }
Having gotten this far on my own, I am not sure how to avoid/do away with/restructure/alter/fix this nested loop structure. Is there any way to increase run time by changing the way I am thinking about loops?

Anything you monks can offer would be greatly appreciated by my overheating processor.

Bio

---- Even a blind squirrel finds a nut sometimes.

Replies are listed 'Best First'.
Re: Nested loops of doom
by dragonchild (Archbishop) on Apr 07, 2008 at 20:41 UTC
    So, you have code that, essentially, has a runtime of N^3 and you're wondering why its runtime slows down dramatically as N goes up? *blinks*

    A few thoughts:

    • This code looks to be extremely parallelizable. Maybe something like Parallel::ForkManager might help.
    • The number of "#additional things here" looks very suspicious. Maybe, you shouldn't do so much inside loops?
    • Essentially, you're attempting to get a large set of mother-father pairs, then do stuff with those pairs. This sounds very much like a directed cartesian join. That screams a relational database to me. They are very efficient at set generation.

    My criteria for good software:
    1. Does it work?
    2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
Re: Nested loops of doom
by samtregar (Abbot) on Apr 07, 2008 at 20:29 UTC
    If you want to speed up your code, you need to find out what's taking up the most time. The way to do that is to use a profiler, probably on a reduced-size run. You can try Devel::DProf, Devel::Profile or perhaps Devel::SmallProf.

    Once you know where the time is being spent you can work on finding a faster way to do it. You could post again with a more specific question, revealing perhaps what's actually done in the middle of your innermost loop.

    -sam

Re: Nested loops of doom
by BrowserUk (Patriarch) on Apr 07, 2008 at 22:54 UTC

    By far the best way to optimise, is to find a better algorithm. Basically, avoid doing stuff, or do it as few times as possible. Sometimes tackling a problem from a different viewpoint can make a very dramatic difference to run times.

    To suggest a better algorithm, we would need to know the details of what you are doing.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Nested loops of doom
by BioNrd (Monk) on Apr 08, 2008 at 14:08 UTC
    The tip to use the development tools paid off. I found a bit of code that was eating up the majority of the time in the routine that this was apart of. Thanks for the help
    ---- Even a blind squirrel finds a nut sometimes.