cjacksonjr has asked for the wisdom of the Perl Monks concerning the following question:

I wrote a program to read an alarm log. The foreach loop is very slow (takes 2 days to run) because there are 31,197 alarms each analyzing 155 Quench Times. I wrote sub GetQuenchInfo to create a hash to try to speed up... but I am having trouble understanding how to call it or how to use the hash.

    foreach $supply (@supplies) {
        $AlarmCount++ if(($supply->[4] - $supply->[3]) <= $AlarmFilter);
        my $dbEvents = new Sybase::CTlib 'ops','opsops','OPSYB1','filleventsT';
        $dbEvents->ct_sql("use rhic_au_fy01_fill");
        my $sql = "SELECT * FROM fillEventsT
                   WHERE rhicTime like 'Oct%' and (event like 'ev-bquench' or event like 'ev-yquench')";
        my(@fills,$fill);
        @fills = $dbEvents->ct_sql($sql);
        foreach $fill (@fills) {
            # how many alarms occur w/in (Y) seconds of a QUENCH EVENT?
            $QuenchCount++ if(($supply->[3] - $fill->[0]) <= $QuenchFilter) and (($supply->[3] - $fill->[0]) > 0);
        }
    }

    sub GetQuenchTimes {
        # put quench times in a hash
        use strict;
        my $dbEvents = new Sybase::CTlib 'harmless','harmless','OPSYB1','filleventsT';
        $dbEvents->ct_sql("use rhic_au_fy01_fill");
        my $sql = "SELECT rhicTimeUS, rhicTime FROM fillEventsT
                   WHERE rhicTime like 'Oct%' and (event like 'ev-bquench' or event like 'ev-yquench')";
        my(@times,$time);
        @times = $dbEvents->ct_sql($sql);
        foreach $time (@times) {
            $rhicTimeUS{$time->[0]};
            print "<$time->[0]> \n";
        }
    }

Replies are listed 'Best First'.
Re: how to speed up program?
by dws (Chancellor) on Aug 16, 2002 at 16:46 UTC
    The foreach loop is very slow (takes 2 days to run) because there are 31,197 alarms each analyzing 155 Quench Times. I wrote sub GetQuenchInfo to create a hash to try to speed up...

    You made two mistakes here: First, you started optimizing within a loop before you considered what you could move out of the loop. Second, you began pursuing a particular optimization (creating a hash) before you'd gathered data on where your script was spending its time.

    Making a database connection and running the queries can be moved out of the loop, since nothing they do depends on $supply, which is the only thing varying in that loop. (Is this intentional, or a bug?) But for a moment, let's pretend they can't be moved out of the loop.

    Since you know the performance problem is in the loop, a reasonable approach is to pick the loop apart, looking for where the script might be spending time. Then add some simple timing code to gather data. For example:

        my $aboutToConnect = time(); # start timing
        my $dbEvents = new Sybase::CTlib 'harmless','harmless','OPSYB1','filleventsT';
        print "Sybase::CTlib took ", time() - $aboutToConnect, " seconds to connect\n";

    You might then have discovered that making a new database connection is relatively expensive.

    Ditto for timing the queries.

    If you discover that 99% of your time is going to database work, then trying to improve performance by using a hash is pointless, unless reducing that 1% to 0.5% is going to pay off.

      Um, time has only 1 second resolution. For him to discover the time wasted, he needs to install Time::HiRes and then call gettimeofday.
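
      A minimal sketch of sub-second timing with Time::HiRes (bundled with Perl since 5.8, so it may not need installing on a newer perl). step_to_time is a hypothetical stand-in for whatever is being measured, such as the connect or the query; it just sleeps here so the example runs without a database.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Hypothetical stand-in for the step being timed (e.g. the connect or the
# query). It only sleeps for about 0.05 seconds.
sub step_to_time { Time::HiRes::sleep(0.05) }

my $t0 = [gettimeofday];           # [seconds, microseconds] at the start
step_to_time();
my $elapsed = tv_interval($t0);    # fractional seconds elapsed since $t0

printf "step took %.3f seconds\n", $elapsed;
```

      Wrapping each suspect step (connect, "use" statement, SELECT, inner loop) this way should show which one dominates a single pass through the loop.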
Re: how to speed up program?
by Zaxo (Archbishop) on Aug 16, 2002 at 15:56 UTC

    You appear to be opening a new db connection and reading the same constant array each time through the loop. Try moving that part outside the loop.
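
    A rough sketch of what that could look like. fetch_quench_rows is a hypothetical stand-in for the Sybase connect-and-query from the question, and the alarm rows and filter values are made up so the example runs on its own; only the shape of the loops is the point.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the Sybase query in the question: returns rows
# whose first column is a quench time in seconds. Counts how often it runs.
my $queries_run = 0;
sub fetch_quench_rows {
    $queries_run++;
    return ([100], [200], [300]);
}

# Made-up sample alarms; as in the question, index 3 is used as the start
# time and index 4 as the end time.
my @supplies = (
    [undef, undef, undef, 105, 107],
    [undef, undef, undef, 250, 400],
    [undef, undef, undef, 301, 302],
);
my ($AlarmFilter, $QuenchFilter) = (5, 10);   # assumed values, in seconds
my ($AlarmCount,  $QuenchCount)  = (0, 0);

# Fetch the quench rows once, before the loop, instead of once per alarm.
my @fills = fetch_quench_rows();

foreach my $supply (@supplies) {
    $AlarmCount++ if ($supply->[4] - $supply->[3]) <= $AlarmFilter;
    foreach my $fill (@fills) {
        # Same test as the original: alarm starts after the quench,
        # and within $QuenchFilter seconds of it.
        my $delta = $supply->[3] - $fill->[0];
        $QuenchCount++ if $delta > 0 and $delta <= $QuenchFilter;
    }
}

print "alarms: $AlarmCount, near quench: $QuenchCount, queries: $queries_run\n";
```

    With the real data this would mean one connection and one SELECT in total, rather than one of each for every alarm.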

    After Compline,
    Zaxo