in reply to Help tightening up a subroutine please

Hmm. Do you remember this?

There is a possible caveat with this change! Your explicit loop does allow you to short-circuit at the high-pass limit, which isn't possible with grep. However, as shown above, using grep allows you to avoid the creation, and later copying, of the temporary array. That, combined with grep's inherently better performance, may outweigh the benefit of the short-circuiting. Or it may not.
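For concreteness, the two approaches being compared look roughly like this (a minimal sketch with made-up data; the variable names $lowerlimit and $upperlimit follow the naming used later in the thread):

```perl
my @sorted = ( 1, 3, 7, 12, 18, 25, 40 );
my( $lowerlimit, $upperlimit ) = ( 5, 20 );

# grep scans every element, but builds the result list directly,
# with no temporary array to copy afterwards.
my @band = grep { $_ >= $lowerlimit and $_ <= $upperlimit } @sorted;

# An explicit loop over the sorted data can bail out early once
# the upper limit has been passed.
my @band2;
for my $v ( @sorted ) {
    next if $v < $lowerlimit;
    last if $v > $upperlimit;    # the short-circuit grep can't do
    push @band2, $v;
}
```

Both produce ( 7, 12, 18 ) here; which wins depends on how early the loop can bail out.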

The only way to tell is to make the change and benchmark it. If the array being filtered is sufficiently large, and the pass band sufficiently narrow and low enough in the range that the short-circuit is beneficial, then it would be better to use a binary search to discover the low and high range limits, and then copy the values across to the result set using an array slice.
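A minimal sketch of that binary-search-plus-slice idea (the helper name lower_bound and the data are made up for illustration; the +1 trick for the upper limit assumes integer values):

```perl
# Classic binary search for the first index whose value is >= $target
# in a sorted array.
sub lower_bound {
    my( $aref, $target ) = @_;
    my( $lo, $hi ) = ( 0, scalar @{ $aref } );
    while( $lo < $hi ) {
        my $mid = int( ( $lo + $hi ) / 2 );
        if( $aref->[ $mid ] < $target ) { $lo = $mid + 1 }
        else                            { $hi = $mid }
    }
    return $lo;
}

my @sorted = ( 2, 4, 4, 8, 15, 16, 23, 42 );
my $lo = lower_bound( \@sorted, 4 );        # first index with value >= 4
my $hi = lower_bound( \@sorted, 16 + 1 ) - 1; # last index with value <= 16
my @band = @sorted[ $lo .. $hi ];           # copy across with a slice
```

Here @band ends up as ( 4, 4, 8, 15, 16 ); for non-integer limits you would instead search for the first value strictly greater than the upper limit.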

Try substituting this for the grep.

my $aref = $matches{ $fasta_id }{ $sitekey };
my( $lo, $hi ) = ( 0, scalar @{ $aref } );
++$lo while $aref->[ $lo + 1 ] < $lowerlimit;
--$hi while $aref->[ $hi - 1 ] > $upperlimit;
@{ $sets{ $fasta_id }[ $setscounter ]{ $sitekey } } = @{ $aref }[ $lo .. $hi ];

If that improves your performance, then using a binary search (as mentioned before) to find the lower and upper bounds should improve it further. Though binary searches coded in Perl aren't always as effective as you might hope.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
"Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."

Re^2: Help tightening up a subroutine please
by mdunnbass (Monk) on Jan 23, 2007 at 21:13 UTC
    So, I have implemented this, with the following minor change (to account for boundary conditions):
    my $aref = $matches{ $fasta_id }{ $sitekey };
    my( $lo, $hi ) = ( 0, scalar @{ $aref } );
    ++$lo while ($aref->[ $lo + 1 ] < $lowerlimit && $lo < $hi-2);
    --$hi while ($aref->[ $hi - 1 ] > $upperlimit && $hi > 0 );

    I would love to post the timing results from dprofpp, but after 25 minutes it is still running. And, since I have a print statement in there for just this purpose, I know that it is not hung; the program is still running as intended. So the binary selector in this case is even slower than the first two alternatives.

    Back to the drawing board, I guess. I'd really love to make this sub run much much faster.

    Hey, here's a random thought.... Is there a way to quickly zoom in to a location in an array that your variable's value is near? Like:

    @a = ( 1 .. 100 );
    $val = 43.56;
    $nearby = zoom($val,@a);
    # $nearby now is essentially an index for @a somewhere in the near
    # vicinity of 43.56, say 40 or so..
    That would be great.... Oh well.
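    For what it's worth, the hypothetical zoom() is essentially a binary search over the sorted array; a minimal sketch (the name and calling convention are just the ones imagined above, except that the array is passed by reference):

```perl
# Binary search: return the index of the first element >= $val,
# which lands you in the near vicinity of $val in a sorted array.
sub zoom {
    my( $val, $aref ) = @_;
    my( $lo, $hi ) = ( 0, $#{ $aref } );
    while( $lo < $hi ) {
        my $mid = int( ( $lo + $hi ) / 2 );
        if( $aref->[ $mid ] < $val ) { $lo = $mid + 1 }
        else                         { $hi = $mid }
    }
    return $lo;
}

my @a = ( 1 .. 100 );
my $nearby = zoom( 43.56, \@a );    # 43: $a[43] == 44, just above 43.56
```

That takes O(log n) comparisons rather than scanning in from one end.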

    Thanks,
    Matt

      You could try profiling with this. It does a bastardized binary chop on both ends to locate the passband and then steps back to locate the limits:

      sub bandpass {
          my( $aref, $loValue, $hiValue ) = @_;
          return if $loValue > $aref->[-1] or $hiValue < $aref->[0];
          my( $lo, $hi ) = ( 1, $#{ $aref } );
          $lo += $lo while $aref->[$lo] < $loValue;
          --$lo while $lo and $aref->[$lo-1] >= $loValue;
          $hi >>= 1 while $aref->[$hi] > $hiValue;
          ++$hi while $hi < $#$aref and $aref->[$hi+1] <= $hiValue;
          return @{ $aref }[ $lo .. $hi ];
      }

      ...

      my $aref = $matches{ $fasta_id }{ $sitekey };
      $sets{ $fasta_id }[ $setscounter ]{ $sitekey } = [ bandpass( $aref, $lowerlimit, $upperlimit ) ];

      The bastardization is designed to handle non-unique and/or missing values in the set and still locate the appropriate boundaries. If you know your values will always be unique, that could simplify things a little. If you knew that the limits ($lowerlimit & $upperlimit) would always be contained within the set, that would simplify things considerably, but from what you've said elsewhere, that is not the case.


      Update: Further testing with more realistic test data showed the above to be both inefficient and flawed.

      The problem with optimising this is the range of possible variations. Optimising for the general case is likely to do very badly for particular cases. Your original solution works well for narrow bands that are low in the range:

      min---------loPass--hiPass------------------------------------max

      But will perform badly for either of the following, where grep will perform quite well:

      min-------------------------------------------loPass--hiPass--max
      min--loPass-------------------------------------------hiPass--max

      For the latter of these two, scanning in from either end will be optimal, as with my crude linear solution; but from what you've said about the datasets, it seems that your pass bands are quite narrow but can be located anywhere within the entire range. And that is the hardest scenario to optimise for.

      Here is another version that I believe to be both correct, and tailored towards being efficient for the type of data you've indicated you are working with. The revelation came to me that instead of using two binary chops to locate the two ends of the pass band--which is quite expensive, as your limits may not appear in the dataset--I could use a single binary chop to locate a (single) value that lies within the pass band, and then use two linear searches out from that value to locate the lower and upper limits. If your pass bands are narrow, these linear searches will have little to do. They also neatly handle the possibility of duplicate values, which complicates binary chops.

      sub bandpass {
          my( $aref, $loValue, $hiValue ) = @_;
          return if $loValue > $aref->[-1] or $hiValue < $aref->[0];
          my( $lo, $hi ) = ( 0, $#{ $aref } );
          while( $lo < $hi ) {
              my $mid = int( ( $lo + $hi ) / 2 );
              if( $aref->[ $mid ] >= $loValue and $aref->[ $mid ] <= $hiValue ) {
                  $lo = $hi = $mid;
                  last;
              }
              elsif( $aref->[ $mid ] < $loValue ) {
                  $lo = $mid + 1;
              }
              elsif( $aref->[ $mid ] > $hiValue ) {
                  $hi = $mid - 1;
              }
          }
          return if $loValue > $aref->[ $hi ] or $hiValue < $aref->[ $lo ];
          --$lo while $lo and $aref->[ $lo - 1 ] >= $loValue;
          ++$hi while $hi < $#{ $aref } and $aref->[ $hi + 1 ] <= $hiValue;
          return @{ $aref }[ $lo .. $hi ];
      }

      Again, no guarantees. It's very hard to do this kind of optimisation without real datasets to benchmark against.


        Wow. Just, Wow. The time spent performing the WEED task went from almost 500 seconds down to 33 seconds. Fantastic!

        Urodele 17:47:27 ~/Desktop> perl -w weedtest
        Benchmark: timing 100000 iterations of mybinary, singlepass, useagrep...
          mybinary:  1 wallclock secs ( 0.58 usr +  0.01 sys =  0.59 CPU) @ 169491.53/s (n=100000)
        singlepass:  1 wallclock secs ( 0.58 usr +  0.00 sys =  0.58 CPU) @ 172413.79/s (n=100000)
          useagrep: 270 wallclock secs (266.95 usr +  0.63 sys = 267.58 CPU) @ 373.72/s (n=100000)
                        Rate useagrep mybinary singlepass
        useagrep       374/s       --    -100%      -100%
        mybinary    169492/s   45253%       --        -2%
        singlepass  172414/s   46034%       2%         --
        Urodele 17:51:59 ~/Desktop> dprofpp
        Total Elapsed Time = 361.1940 Seconds
          User+System Time = 306.4840 Seconds
        Exclusive Times
        %Time ExclSec CumulS #Calls sec/call Csec/c Name
         80.7   247.5 247.52  22012   0.0112 0.0112 main::SEARCHFASTA
         7.78   23.84 23.849 127206   0.0000 0.0000 main::BANDPASS
         4.45   13.64 13.649    164   0.0832 0.0832 main::GET_TEXT
         3.07   9.399 33.248      1   9.3990 33.248 main::WEED
         2.30   7.060  7.060      1   7.0600 7.0600 main::INDEX_FASTA
         0.82   2.519  2.519    164   0.0154 0.0154 main::ADD_SPAN
         0.60   1.829  4.398      1   1.8292 4.3983 main::HTML_FORMAT
         0.36   1.110  1.110      1   1.1100 1.1100 main::WEEDNUM
         0.20   0.610 308.51      2   0.3049 154.25 main::MAIN
         0.12   0.380  0.380      1   0.3800 0.3800 main::OVERLAP
         0.11   0.340  0.340      1   0.3400 0.3400 main::CLUSTER
         0.05   0.140  0.140      2   0.0700 0.0700 main::GET_ENDS
         0.02   0.050  0.050      1   0.0500 0.0500 main::TABLE_IT
         0.01   0.030  0.410      1   0.0300 0.4100 main::SORT_HITS
         0.01   0.020  0.020      2   0.0100 0.0100 main::WEED_HEADERS
        Urodele 17:54:51 ~/Desktop>

        Just as a reminder, the dprof output of my original program was:

        Total Elapsed Time = 1135.701 Seconds
          User+System Time = 770.4718 Seconds
        Exclusive Times
        %Time ExclSec CumulS #Calls sec/call Csec/c Name
         63.8   491.8 491.86      1 491.86 491.86 main::WEED
         32.5   251.1 251.10  22012 0.0114 0.0114 main::SEARCHFASTA
         2.10   16.20 16.209    164 0.0988 0.0988 main::GET_TEXT
         0.71   5.460  5.460      1 5.4600 5.4600 main::INDEX_FASTA
         0.40   3.089  3.089    164 0.0188 0.0188 main::ADD_SPAN
         0.15   1.140  1.140      1 1.1400 1.1400 main::WEEDNUM
         0.08   0.599 770.48      2 0.2994 385.24 main::MAIN
         0.05   0.380  0.380      1 0.3800 0.3800 main::OVERLAP
         0.05   0.350  0.350      1 0.3500 0.3500 main::CLUSTER
         0.02   0.130  0.130      2 0.0650 0.0650 main::GET_ENDS
         0.01   0.048  3.176      1 0.0482 3.1764 main::HTML_FORMAT
         0.01   0.040  0.040      1 0.0400 0.0400 main::TABLE_IT
         0.01   0.040  0.420      1 0.0400 0.4200 main::SORT_HITS
         0.00   0.020  0.020      2 0.0100 0.0100 main::WEED_HEADERS
         0.00   0.010  0.010      1 0.0100 0.0100 warnings::BEGIN

        Thanks!!!

      I would choose Benchmark to do the comparison and go to bed if I were you...

      -Paul

        Actually, that wasn't a bad idea. Using the hypothetical data set from my response to Eric below, I benchmarked the 2 subs initially posted in this thread. At 50k iterations, here's what I got:

        Benchmark: timing 50000 iterations of mybinary, use_grep...
        mybinary: 15 wallclock secs (15.23 usr + 0.03 sys = 15.26 CPU) @ 3276.54/s (n=50000)
        use_grep: 13 wallclock secs (12.83 usr + 0.04 sys = 12.87 CPU) @ 3885.00/s (n=50000)
                     Rate mybinary use_grep
        mybinary   3277/s       --     -16%
        useagrep   3885/s      19%       --

        I have not tried BrowserUK's latest edition of the sub yet. That's for this afternoon/evening.

        Matt