Re^3: Some code optimization

So the "is_contained()" call is costing a lot -- and it's doing some unnecessary work:

sub is_contained ($$$) {
    my ( $some_legal_range, $gene_legal_range, $legal_length ) = @_;

    ### save result for later use:
    my $range_length = legal_range_length($some_legal_range)
        or return 0;

#    if ( legal_range_length($some_legal_range) == 0 ) {
#        return 0;
#    }

    my $intersection_legal_range =
      intersect_legal_ranges( $some_legal_range, $gene_legal_range,
        $legal_length );

    return ( legal_range_length($intersection_legal_range) == $range_l
+ength );

#    if ( legal_range_length($intersection_legal_range) ==
#        legal_range_length($some_legal_range) )  ### <- already did t
+his call
#    {
#        return 1;
#    }
#    return 0;
}

# ...

sub intersect_legal_ranges($$$) {
    my ( $legal_range_1, $legal_range_2, $legal_length ) = @_;

    my $intersections_a = [];

    my $intersect_h;  ### let's declare this in advance

    for ( my $i = 0 ; $i < scalar( @{$legal_range_1} ) ; $i++ ) {
        for ( my $j = 0 ; $j < scalar( @{$legal_range_2} ) ; $j++ ) {
            $intersect_h = intersect_simple_ranges( ${$legal_range_1}[
+$i],
                ${$legal_range_2}[$j] ) and     ### only push what you
+'ll use
                push @{$intersections_a}, $intersect_h;
        }
    }
    $intersections_a = flatten_simple_ranges( $intersections_a, $legal
+_length );
}

sub flatten_simple_ranges($$) {
    my ( $simple_range_a, $legal_length ) = @_;

    my $legal_range = [];
    my $last_to     = undef;

#    # filter out undef
#    $simple_range_a = [ grep { $_ } @{$simple_range_a} ];  ### <- not
+ needed
    ...
[download]

When I ran the original code on my macbook, reported a total loop time of 95 sec.

After making the few changes mentioned above (remove a redundant function call, don't push undef onto an array, remove the grep that filters out the undef items that aren't being pushed now), the total loop time dropped to 80 sec.

If I also move the substance of the "intersect_simple_ranges()" function into the one block where that function is called (i.e. eliminate that one function call), it drops to 78 sec. Changing the {FROM, TO} AoH to a (less legible) AoA trimmed it to 75.

Beyond that, I don't see anything obvious, but I haven't taken the time to grok the overall algorithm (let alone comprehend the ultimate goal). There might be easier ways to accomplish whatever you're doing here, e.g. by perhaps using a simpler data structure.

UPDATE: Something to consider would be whether there's a way to make sure the required sorting is handled somewhere other than the inner-most loop, if at all possible. That is, can the algorithm be made to work in such a way that the data is sorted once before going into a given loop, thereby making it unnecessary to do repeated sorts within the loop?

Comment on Re^3: Some code optimization Select or Download Code

Replies are listed 'Best First'.
Re^4: Some code optimization by roibrodo (Sexton) on Jun 18, 2010 at 09:24 UTC
Thank you, I will look into it. But I think the problem starts with the first problematic function (as the name suggests :)) - `gene_to_legal_range`. When I `next` just before it I run 6 times faster than if I `next` just after it (~4 seconds compared to ~27 seconds for 50 iterations). This subroutine really looks simple to me, so I don't see why it has to take so much time (subroutine overhead?!). p.s. I now run my home PC and the benchmarking results are a little different (but the relative differences are similar). UPDATE Almost always the sort is done on an array with a single element (there are cases where there are more). I even tried removing the sort at all, just to see the effect on running time - there is no change. But, again, I suggest we start with the first, more simple subroutine. Is it really just perl subroutine overhead? Is there anything I can do about it?	[reply] [d/l] [select]
Re^5: Some code optimization by graff (Chancellor) on Jun 18, 2010 at 09:46 UTC
I think the problem starts with the first problematic function... Now you aren't making sense. According to your initial description of timing relations: `65 sec for full code 15 sec for code MINUS "is_contained()" call (2nd "next" uncommented)` [download] That really makes it look like "is_contained()" accounts for most of the run time, which I would expect, since that is the part that involves several more layers of function calls, with lots of min, max, sort, grep, etc going on (some of it obviously not needed). "Look into it", for sure. To reiterate the question in my update: is there a way to move the sorting out of the innermost loop? E.g. might there be an appropriate "sort" on one or both of the lists that drive the two "foreach" loops in "scenario()", which would simplify operations inside the inner loop? Can anything be done as part of the outer foreach so that you don't have to do it repeatedly as part of the inner loop?	[reply] [d/l]
Re^6: Some code optimization by roibrodo (Sexton) on Jun 18, 2010 at 10:03 UTC
"is_contained()" indeed accounts for most of the run time, but "gene_to_legal_range()" is also much slower than I would expect. Updated benchmarks (tested again with the original posted code): - Both subroutines executed: ~64 seconds - Only "gene_to_legal_range()" is executed: ~15 seconds - Only "gene_to_legal_range()" is executed but its body replaced with just a `return`: ~6.1 seconds - none is executed: ~4.5 seconds So "gene_to_legal_range()", which is quite simple, triples the run time (15 vs 4.5). Also just calling this subroutine, even if it does nothing (just a `return`) adds some significant time (6.1 vs 4.5). In the final program I will have ~1000 iterations (instead of 50) so these differences do matter. To conclude, I agree that "is_contained()" consumes more time, but I suggest that we first assert that the first, much simpler function is "optimal" Since I'm new to perl optimization, this might also help me know what to expect, what is considered OK (e.g. such an overhead for a subroutine call?) and what is not. UPDATE: and as for sort, again, I removed it completely. Got ~60 seconds (instead of ~64). But I again suggest we first "clear" with the first, simpler sub.	[reply] [d/l] [select]
Re^7: Some code optimization by graff (Chancellor) on Jun 18, 2010 at 10:20 UTC
Re^8: Some code optimization by roibrodo (Sexton) on Jun 18, 2010 at 10:32 UTC
Some notes below your chosen depth have not been shown here