in reply to Re^5: Refactoring a large script
in thread Refactoring a large script

OK, so here's an interesting tidbit. I tried out your modifications to my WEED() sub, and it wound up nearly doubling the time the sub took to run! Weird. Here's the (truncated) dprofpp output for my original sub:

Total Elapsed Time = 1135.701 Seconds
  User+System Time = 770.4718 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 63.8   491.8 491.86      1   491.86 491.86  main::WEED
 32.5   251.1 251.10  22012   0.0114 0.0114  main::SEARCHFASTA
 2.10   16.20 16.209    164   0.0988 0.0988  main::GET_TEXT
 0.71   5.460  5.460      1   5.4600 5.4600  main::INDEX_FASTA
 0.40   3.089  3.089    164   0.0188 0.0188  main::ADD_SPAN
 0.15   1.140  1.140      1   1.1400 1.1400  main::WEEDNUM
(......)

And here's the output with the modifications you suggested:

Total Elapsed Time = 1292.196 Seconds
  User+System Time = 1200.096 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 76.9   923.0 923.08      1   923.08 923.08  main::WEED
 20.6   247.6 247.61  22012   0.0112 0.0112  main::SEARCHFASTA
 1.30   15.57 15.579    164   0.0950 0.0950  main::GET_TEXT
 0.60   7.240  7.240      1   7.2400 7.2400  main::INDEX_FASTA
 0.17   2.059  2.059    164   0.0126 0.0126  main::ADD_SPAN
 0.14   1.729  3.848      1   1.7292 3.8483  main::HTML_FORMAT
(......)

So, I am guessing that the grep wound up being less efficient than the 'band pass' filter I had set up. Weird. That said, I have not yet tried a combination of our two different methods. I have a feeling most of your comments are going to wind up helping, but that in this case, the nature of the data just lends itself better to the filter approach.
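
For anyone who wants to test that guess in isolation, here is a minimal sketch. It is not the real WEED() code: the data, the bounds, and the sub names with_grep and with_band_pass are all made up for illustration, and it assumes the list is already sorted. Under that assumption a loop can stop as soon as it passes the upper bound, while grep always tests every element. Whether that is what actually happens inside WEED() is only a guess.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical data: sorted numeric positions, NOT the real WEED() input.
my @positions = map { $_ * 3 } 0 .. 9_999;
my ( $lo, $hi ) = ( 1_000, 1_300 );

# grep tests every element, even those far beyond the upper bound.
sub with_grep {
    my ( $list, $lo, $hi ) = @_;
    return [ grep { $_ >= $lo && $_ <= $hi } @$list ];
}

# A 'band pass' style loop: because the input is assumed sorted,
# it can bail out as soon as a value exceeds the upper bound.
sub with_band_pass {
    my ( $list, $lo, $hi ) = @_;
    my @keep;
    for my $p (@$list) {
        next if $p < $lo;    # below the band, skip
        last if $p > $hi;    # above the band, nothing further can match
        push @keep, $p;
    }
    return \@keep;
}

# Both approaches should select the same elements.
my $from_grep = with_grep( \@positions, $lo, $hi );
my $from_loop = with_band_pass( \@positions, $lo, $hi );

# Run each variant for at least one CPU second and print a comparison table.
cmpthese( -1, {
    grep      => sub { with_grep( \@positions, $lo, $hi ) },
    band_pass => sub { with_band_pass( \@positions, $lo, $hi ) },
} );
```

The relative speeds depend heavily on where the band sits in the list and on whether the data really is sorted, so the numbers from a toy like this are only a hint, not a substitute for profiling the real sub.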

I hope that I can still shave significant time off this sub, so I'll post it to the "Seekers of Perl Wisdom" forum for additional help.

Thank you so much though for taking the time to try and help me fix this. I really appreciate it.

Matt

Re^7: Refactoring a large script
by BrowserUk (Patriarch) on Jan 23, 2007 at 15:57 UTC
      Not yet, I'll be setting that up this afternoon.

      You beat me to the punch on that one. I was already planning it. :D

      Update: Here is the original sub's run times:

      Total Elapsed Time = 1135.701 Seconds
        User+System Time = 770.4718 Seconds
      Exclusive Times
      %Time ExclSec CumulS #Calls sec/call Csec/c  Name
       63.8   491.8 491.86      1   491.86 491.86  main::WEED
       32.5   251.1 251.10  22012   0.0114 0.0114  main::SEARCHFASTA

      And here it is without the switch to grep, but keeping the sorting fixes you suggested, as well as most of your other suggestions:

      Total Elapsed Time = 789.7615 Seconds
        User+System Time = 741.4615 Seconds
      Exclusive Times
      %Time ExclSec CumulS #Calls sec/call Csec/c  Name
       65.4   485.3 485.35      1   485.35 485.35  main::WEED
       33.5   248.8 248.86  22012   0.0113 0.0113  main::SEARCHFASTA
       0.96   7.120  7.120      1   7.1200 7.1200  main::INDEX_FASTA

      So, removing the double sorting shaved about 6.5 seconds off WEED (491.86 down to 485.35 seconds). It's a start. Now to try to shave off another 300 or so...

      Matt