in reply to Re^2: Some code optimization
in thread Some code optimization

In your initial question you stated, "The function itself is quite simple - it gets 3 scalars (one of which is a field of some object), checks a couple of if's on the values and some basic math (including modulo %), and returns an array with a couple of hashes, each with two numerical fields. That's it." But the code you presented here shows a quagmire of complexity: loops within loops, greps within loops (a grep is just another form of loop), sorting within loops (which implies still more loops), and so on. Having first read your initial question (with no code posted), and then later reading the code you posted, I couldn't believe my eyes. My first thought was, "This must be a practical joke. We're being suckered." What you're calling "checking a couple of ifs and some basic math" is actually nested loops with sorting and grepping inside of them. From an efficiency standpoint, it couldn't get much worse than that.
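To make the hidden cost concrete, here is a contrived sketch (not your code; the list and the lookup are invented for illustration) of why a grep inside a loop is really a nested loop, and how a hash index removes the inner scan:

use strict;
use warnings;

my @items = (1 .. 1_000);

# Hidden nested loop: grep rescans all of @items on every pass, so
# this visibly single loop actually performs ~n*n comparisons.
for my $x (@items) {
    my ($hit) = grep { $_ == $x } @items;
}

# Hash version: one O(n) pass to build an index, then O(1) lookups,
# for O(n) work overall.
my %seen = map { $_ => 1 } @items;
for my $x (@items) {
    my $hit = $seen{$x};
}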

Let's imagine a really simple case in this code fragment:

my $n = 100;
my $c = 0;
foreach (0 .. $n) {
    $c++;
}
print "$c\n";

That loop executes in O(n) time. Now consider this loop:

my $n = 100;
my $c = 0;
foreach my $outer (0 .. $n) {
    foreach my $inner (0 .. $n) {
        $c++;
    }
}
print "$c\n";

That chunk of code executes in O(n^2) time: for an input of size n, there are roughly n^2 iterations. If your code stopped there, you would still be questioning why it's taking so long. But it doesn't. I didn't wander through every level of looping, but I think your efficiency is at least quadratic, which is to say, inefficient. I don't know how you could equate multiple layers of nested loops with "The function itself is quite simple..." The two descriptions are mutually exclusive.
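If you want to see the quadratic growth directly, here is a small sketch using the same toy loop as above: double n and the iteration count roughly quadruples.

use strict;
use warnings;

# Doubling n quadruples the work; counts come out to (n+1)^2 because
# the ranges include both endpoints.
for my $n (100, 200, 400) {
    my $c = 0;
    for my $outer (0 .. $n) {
        for my $inner (0 .. $n) {
            $c++;
        }
    }
    print "n=$n -> $c iterations\n";   # 10201, 40401, 160801
}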

It could be there is a more efficient algorithm out there to solve the problem you're tackling. OR, it could be that you have one of those problems for which there is no efficient solution. If that's the case, thank goodness you've got a computer to do it for you. ;)


Dave

Replies are listed 'Best First'.
Re^4: Some code optimization
by roibrodo (Sexton) on Jun 18, 2010 at 08:32 UTC

    Let's look at a single iteration and define n as the number of genes. A gene is a start point plus a length, i.e. n is the number of elements pushed during the creation of $simulation_h (n = 70,000 in the example).

    The main loop is indeed a nested loop, but it simply fetches one gene at a time from the hash, so the traversal itself is O(n) (before doing anything with each gene).
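    For instance, if the hash maps each chromosome to its list of genes (the layout here is assumed for illustration; it is not the actual $simulation_h), the nested foreach still visits each gene exactly once:

    my $simulation_h = {
        chr1 => [ { start => 10, len => 300 }, { start => 900, len => 50 } ],
        chr2 => [ { start => 5,  len => 120 } ],
    };

    my $visited = 0;
    for my $chrom (keys %$simulation_h) {
        for my $gene (@{ $simulation_h->{$chrom} }) {
            $visited++;   # constant work per gene
        }
    }
    print "$visited genes visited\n";   # each gene once => O(n) total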

    Now, let's look at what is done inside the loop. We call two "problematic" subroutines, which in turn call some other subroutines. The "loops" you see in those other subroutines deal with the case where a gene is split (its start is too close to the "end" of the circular chromosome, so it has to be represented by two ranges instead of one). A range can therefore be either a singleton (one array element) or "split" (two array elements), as sketched below.
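    As a hypothetical sketch of that split (the sub name and argument layout are invented, not the actual gene_to_legal_range), note that no loop is needed, just one test:

    # A gene on a circular chromosome of length $chrom_len becomes one
    # range, or two if it wraps past the end.
    sub gene_to_ranges {
        my ($start, $gene_len, $chrom_len) = @_;
        my $end = $start + $gene_len - 1;
        return ([$start, $end]) if $end < $chrom_len;      # singleton
        return ([$start, $chrom_len - 1],                  # tail piece
                [0, $end - $chrom_len]);                   # wrapped head
    }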

    So, yes, there are loops, but they don't add another factor of n to the complexity: each is bounded by a small constant of at most 2 iterations, so even the double loop that checks a pair of ranges against each other (both ranges "split", in the worst case) runs at most 4 iterations.
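    Here is a sketch of that bounded pairwise check (assumed names and representation, not the actual intersect_legal_ranges); each argument holds at most two [start, end] pairs, so the nested loop runs at most 4 times no matter how large n gets:

    sub ranges_overlap {
        my ($a_ranges, $b_ranges) = @_;
        for my $a (@$a_ranges) {        # at most 2 iterations
            for my $b (@$b_ranges) {    # at most 2 iterations
                return 1 if $a->[0] <= $b->[1] && $b->[0] <= $a->[1];
            }
        }
        return 0;
    }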

    To conclude, the complexity is actually O(n) per iteration (again, n is the number of genes).

      So if the outer loop over n is the only 'big loop', try putting a broadly scoped counter inside each of the 'little loops' (a unique counter for each) and prove to yourself that they're not running away by mistake. ...just one idea. Another is to check how many times you're calling that grep and the sort, and on how many elements; each of those is an implicit loop.

      The idea is to test your theory that the complexity is O(n), and in testing, you may discover that something is not what it seems. I'm not suggesting that your theory is wrong, but that maybe the implementation doesn't match the theory.

      If I were hunting down the problem myself, I would be very interested in how many times each of the loops actually runs. Give each loop its own counter variable, scoped broadly enough that it doesn't reset, and check the counters at the end of the run; that may smoke out the mole. A sketch of that instrumentation follows.
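      To make the counter idea concrete, here is a minimal sketch. The sub name inner_scan and the counter keys are made up for illustration; the point is the file-scoped %count hash that accumulates across calls and is reported once at the end:

      use strict;
      use warnings;

      my %count;   # file-scoped, so tallies accumulate across all calls

      # Hypothetical stand-in for one of the "little loops":
      sub inner_scan {
          my @list = @_;
          $count{grep_calls}++;            # every grep is an implicit loop...
          $count{grep_elems} += @list;     # ...over this many elements
          return grep { $_ > 0 } @list;
      }

      inner_scan(-2 .. 3) for 1 .. 5;      # pretend driver

      END {
          printf "%-12s %d\n", $_, $count{$_} for sort keys %count;
      }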


      Dave

        I added a global counter in the innermost loop in intersect_legal_ranges. The counter's value at the end is, as expected, equal to the number of iterations * n, plus some small overhead for the ranges that happened to be created "split": e.g. for 10 iterations and 70,000 genes I got 702,810 (i.e. 70,281 per iteration when n = 70,000).

        Also note that the slowdown occurs even when the second problematic subroutine (is_contained) is not called at all (i.e. with the next statement before it enabled, so the call is skipped). The first problematic subroutine (gene_to_legal_range) does not use loops at all and is very simple (I hope you agree).