Some fun for the weekend.. I thought people might like to join in my code refactoring (I was just supposed to "make it work" for oracle, but somehow I got carried away, you will see why.. )

Can you simplify the following:

sub LastDalLoad { my $ContentsLoadMsg = shift; my $logFile = "$ENV{ MIS_ROOT }/log/loadmsg_1.log"; my @Result; my $Command; my $indLines; my $finishedLine; my $beginningLine; my $found; my @beginningTime; my @finishedTime; my $maxSecondsAnHour; my @datePart; my @checkDate; my @checkTime; my $beginningInSeconds; my $finishedInSeconds; my $actualInSeconds; my $timeDiff; my $startDiff; my $status = $TRUE; # # Check the status of MIS as stated in mistrap.log # if( ( $statusLoadmsg = &FileExists( $logFile ) ) == $FALSE ) { &ResCode("WARNING"); &BrowseCode("File \"$logFile\" not found"); 
 &SummaryCode("WARNING"); return $WARNING; } # # read whole log file # if( &readFileContents( $logFile, $ContentsLoadMsg ) == $FALSE ) { &ResCode("E_LOADMSG_ERROR"); : &BrowseCode("Could not open \"$logFile\""); 
 &SummaryCode("E_MISLOAD_ERROR"); return $FALSE; } # # find out last load # $finishedLine = ""; $beginningLine = ""; $indLines = $#{ $ContentsLoadMsg }; while ( $indLines > -1 and ( length( $finishedLine ) == 0 or length( $beginningLine + ) == 0 ) ) { if( ${ $ContentsLoadMsg }[ $indLines ] =~ /The utility has fi +nished/ ) { $finishedLine = ${ $ContentsLoadMsg }[ $indLines ] . " " +. ${ $ContentsLoadMsg }[ $indLines + 1 ]; } elsif( ${ $ContentsLoadMsg }[ $indLines ] =~ /The utility is +beginning/ ) { $beginningLine = ${ $ContentsLoadMsg }[ $indLines ] . " " + . ${ $ContentsLoadMsg } [ $indLines + 1 ]; } $indLines --; } # # last load # @beginningTime = &getLineParts( $beginningLine, 4, 4, "\"" ); @finishedTime = &getLineParts( $finishedLine, 4, 4, "\"" ); # # get actual day and time # ( $day, $month, $year ) = &getActualDate; ( $hour, $minute, $second ) = &getActualTime; # get max seconds of a day $maxSecondsAnHour = &getTimeInSeconds( 23, 59, 59 ) + 1; # # get values of DAL load start : # # get pure date part @datePart = &getLineParts( $beginningTime[0], 1, 2 ); @checkDate = &getLineParts( $datePart[0], 1, 3, "-" ); # get pure time part @checkTime = &getLineParts( $datePart[1], 1, 1, "." ); @checkTime = &getLineParts( $checkTime[0], 1, 3, ":" ); # get time of DAL load start in seconds $beginningInSeconds = &getTimeInSeconds( $checkTime[0], $checkTim +e[1], $checkTime[2] ); # get actual time in seconds $actualInSeconds = &getTimeInSeconds( $hour, $minute, $second +); # get difference in days of actual day and last DAL file load st +art $Command = "db2 \"values (julian_day('$year-$month-$day') - "; $Command .= " julian_day('$checkDate[2]-$checkDate[0 +]-$checkDate[1]'))\" "; if( &executeSQL( $Command, \@Result ) == $FALSE ) { &ResCode("E_DB2_ERROR"); &BrowseCode("It was not possible to execute an SQL command"); 
 &SummaryCode("E_MISLOAD_ERROR"); return $FALSE; } # get the value of the previous SQL call for days difference $startDiff = &getFirstNonBlank( $Result[3] ); # is DAL file from same day if( $startDiff == 0 ) { $timeDiff = $actualInSeconds - $beginningInSeconds; # have more than MIS_MAX_DALLOADTIMEPASS_H hours gone after +the last DAL files was l oaded if( $timeDiff > &getTimeInSeconds( $Configuration{ MIS_MAX_DA +LLOADTIMEPASS_H }, 0, 0 ) ) { &ResCode("WARNING"); &BrowseCode("Last DAL load operation was started more tha +n $Configuration{ MIS_MA X_DALLOADTIMEPASS_H } hours ago"); $status = $WARNING; } else { &ResCode("OK"); &BrowseCode("Last DAL load operation started at @beginnin +gTime"); } } # is DAL file from yesterday ( check if time difference over mid +night is within range ) elsif( $startDiff == 1 ) { $timeDiff = $maxSecondsAnHour - $beginningInSeconds; $timeDiff += $actualInSeconds; # have more than MIS_MAX_DALLOADTIMEPASS_H hours gone after +the last DAL files was l oaded if( $timeDiff > &getTimeInSeconds( $Configuration{ MIS_MAX_DA +LLOADTIMEPASS_H }, 0, 0 ) ) { &ResCode("WARNING"); &BrowseCode("Last DAL load operation was started more tha +n $Configuration{ MIS_MAX_DALLOADTIMEPASS_H } hours ago"); $status = $WARNING; } else { &ResCode("OK"); &BrowseCode("Last DAL load operation started at @beginnin +gTime"); } } # DAL file is more than one day old else { &ResCode("WARNING"); &BrowseCode("Last DAL load operation started at @beginningTim +e"); $status = $WARNING; } # # get values of DAL load end # # get pure date part @datePart = &getLineParts( $finishedTime[0], 1, 2 ); @checkDate = &getLineParts( $datePart[0], 1, 3, "-" ); # get pur time part @checkTime = &getLineParts( $datePart[1], 1, 1, "." ); @checkTime = &getLineParts( $checkTime[0], 1, 3, ":" ); # get time of DAL load start in seconds $finishedInSeconds = &getTimeInSeconds( $checkTime[0], $checkTime +[1], $checkTime[2] ); # get difference in days of actual day and last DAL file load st +art $Command = "db2 \"values (julian_day('$year-$month-$day') - "; $Command .= " julian_day('$checkDate[2]-$checkDate[0 +]-$checkDate[1]'))\" "; if( &executeSQL( $Command, \@Result ) == $FALSE ) { &ResCode("E_DB2_ERROR"); &BrowseCode("It was not possible to execute an SQL command"); 
 &SummaryCode("E_MISLOAD_ERROR"); return $FALSE; } # get the value of the previous SQL call for days difference $startDiff = &getFirstNonBlank( $Result[3] ); # is DAL file from same day if( $startDiff == 0 ) { $timeDiff = $actualInSeconds - $finishedInSeconds; # have more than MIS_MAX_DALLOADTIMEPASS_H hours gone after +the last DAL files was l oaded if( $timeDiff > &getTimeInSeconds( $Configuration{ MIS_MAX_DA +LLOADTIMEPASS_H }, 0, 0 ) ) { &ResCode("WARNING"); &BrowseCode("Last DAL load operation ended more than $Con +figuration{ MIS_MAX_DALLOADTIMEPASS_H } hours ago"); $status = $WARNING; } else { &ResCode("OK"); &BrowseCode("Last DAL load operation ended at @finishedTi +me"); } } # is DAL file from yesterday ( check if time difference over mid +night is within range ) elsif( $startDiff == 1 ) { $timeDiff = $maxSecondsAnHour - $finishedInSeconds; $timeDiff += $actualInSeconds; # have more than MIS_MAX_DALLOADTIMEPASS_H hours gone after +the last DAL files was loaded if( $timeDiff > &getTimeInSeconds( $Configuration{ MIS_MAX_DA +LLOADTIMEPASS_H }, 0, 0 ) ) { &ResCode("WARNING"); &BrowseCode("Last DAL load operation ended more than $Con +figuration{ MIS_MAX_DALLOADTIMEPASS_H } hours ago"); $status = $WARNING; } else { &ResCode("OK"); &BrowseCode("Last DAL load operation ended at @finishedTi +me"); } } # DAL file is more than one day old else { &ResCode("WARNING"); &BrowseCode("Last DAL load operation ended at @finishedTime") +; $status = $WARNING; } 
 if( $status == $TRUE ) { &SummaryCode("OK"); } elsif( $status == $WARNING ) { &SummaryCode("WARNING"); } else { &SummaryCode("E_MISLOAD_ERROR"); } return $status; }
I'll post my version (that includes the oracle bits), as a reply.. My favourite bit is "using DB2 as a date calculator". Ah, and caveat, no extra modules.. (Yes, yes, I know, but lets not go into that, shall we.. )

Update Oops, should post data sample maybe. The point of the exercise appears to be, get the last timestamp that a file was loaded into the database, and check its not older than MIS_MAX_DALLOADTIMEPASS_H hours. Sample data:

SQL3109N The utility is beginning to load data from file "/home/mis/jer/tMIS/MIS.05.23-TIX2/dalfiles/loadfiles//20050404_120000 +_". SQL3500W The utility is beginning the "LOAD" phase at time "04-29-200 +5 12:26:10.994507". SQL3112W There are fewer input file columns specified than database c +olumns. SQL3519W Begin Load Consistency Point. Input record count = "5". SQL3520W Load Consistency Point was successful. SQL3110N The utility has completed processing. "7" rows were read fr +om the input file. SQL3519W Begin Load Consistency Point. Input record count = "7". SQL3520W Load Consistency Point was successful. SQL3515W The utility has finished the "LOAD" phase at time "04-29-200 +5 12:26:11.761926".
Update2 In case it wasnt clear, I didnt write this thing originally, I just maintain it.. Which means only change stuff if a) not working, or b) needs new functionality. This means, in particular, that although I would like to go through and remove all the junk like $TRUE, FileExists and other nonsensical pieces, I won't be given the time unless some customer needs it rewritten for whatever reason. (We are currently understaffed as a group of two, and booked up until August or so..)

C.

Replies are listed 'Best First'.
Re: Code refactoring challenge
by merlyn (Sage) on Apr 29, 2005 at 16:11 UTC
    my ... my ... my ... my ... my ...
    Any time I'm being paid to sit down and do a code review, I dread seeing one of these. This is almost always a red flag that what follows will be monolithic, hard to understand, hard to debug, hard to test code, written like we've tossed everything out that we learned about procedural programming in the past twenty years. Go look at the way I typically write code as an alternative.

    Just my first take. Didn't get any further for this reason.

    -- Randal L. Schwartz, Perl hacker
    Be sure to read my standard disclaimer if this is a reply.

Re: Code refactoring challenge
by dragonchild (Archbishop) on Apr 29, 2005 at 17:43 UTC
    I'm with merlyn on this one re: my() at the beginning. A few other thoughts:
    • Get rid of the global variables and environment variables. If you have to use global values, use a Singleton that acts as a configuration object. That way, you can trap who's using it and how. This is important if someone accidentally does something like $MISValue{ MIS_DBTYPE } = 'db2' because they're used to a language that does context-sensitive comparisons. Oops!

      Much better is

      my $config = GlobalConfig->new; # compile-time error if ( $config->mis_dbtype = 'db2' ) { ... } # This works if ( $config->mis_dbtype == 'db2' ) { ... }
    • if( &readFileContents( $logFile, $ContentsLoadMsg ) == $FALSE ) { &ResCode("E_LOADMSG_ERROR"); &BrowseCode("Could not open \"$logFile\""); 
 &SummaryCode("E_MISLOAD_ERROR"); return $FALSE; }

      Get rid of the ampersands. Also, why haven't you wrapped the contents of this if-block in a subroutine?

      sub do_logging { my ($rescode, $browsecode, $summarycode) = @_; ResCode( $rescode ); BrowseCode( $browsecode ); NewLine(); SummaryCode( $summarycode ); } if ( readFileContents( $lofFile, $ContentsLoadMsg ) == $FALSE ) { do_logging( 'E_LOADMSG_ERROR', 'Could not open "$logFile"', 'E_MIS +LOAD_ERROR' ); return $FALSE; }
    • my ($bday, $bmon, $byear, $bhour, $bmin, $bsec); my ($eday, $emon, $eyear, $ehour, $emin, $esec); if(!$MISValue{ MIS_DBTYPE } || $MISValue{ MIS_DBTYPE } eq 'db2') { ($bday, $bmon, $byear, $bhour, $bmin, $bsec) = $beginningLine =~ /"(\d){2}-(\d){2}-(\d){4} (\d){2}:(\d){2} +:(\d){2}/; ($eday, $emon, $eyear, $ehour, $emin, $esec) = $finishedLine =~ /"(\d){2}-(\d){2}-(\d){4} (\d){2}:(\d){2}: +(\d){2}/; } else { ($byear, $bmon, $bday, $bhour, $bmin, $bsec) = $beginningLine =~ /^(.){10} (\d){4}-(\d){2}-(\d)}-(\d){2}\. +(\d){2}\.(\d){2}/; ($eyear, $emon, $eday, $ehour, $emin, $esec) = $finishedLine =~ /^(.){10} (\d){4}-(\d){2}-(\d)}-(\d){2}\.( +\d){2}\.(\d){2}/; }
      DateTime, Date::Calc::Object, Date::Manip ... use them!
    • $finishedLine = ${ $ContentsLoadMsg }[ $ind ] . " " . ${ $ContentsLoad +Msg }[ $ind + 1 ];
      is better written as
      $finishedLine = join ' ', @{$ContentsLoadMsg}[ $ind, $ind + 1 ];

    The Perfect is the Enemy of the Good.

Re: Code refactoring challenge
by tilly (Archbishop) on Apr 29, 2005 at 18:39 UTC
    Some fun for the weekend.. I thought people might like to join in my code refactoring (I was just supposed to "make it work" for oracle, but somehow I got carried away, you will see why.. )

    Can you simplify the following: (...big snip...)

    That sounds too much like real work.

Re: Code refactoring challenge
by castaway (Parson) on Apr 29, 2005 at 16:02 UTC
    My solution (note probably not brilliant, but lots shorter and works :): .. there.. and that "module" is now a whole 3k shorter than it was, and no doubt a deal faster..

    C.

Re: Code refactoring challenge
by cazz (Pilgrim) on Apr 29, 2005 at 17:53 UTC
    There are a couple of other things that I don't care for in this example in addition to the large list of my my my's.

    • Calling functions with &. It isn't needed in most cases, none of which I see in this code.
    • Generating raw SQL without binding columns. There are very few cases where you should not bind columns. This isn't one of them.
    • Using the results of $#{ $data } with a while loop. foreach, last and next are your friends.
      No, & isnt needed, but it's already in all the rest of the code (9 .pm files totalling 600k), so removing them all would be a bit of work, I'd rather stay consistent for now.

      Ah, but you missed the entire point there, the SQL isnt actually needed at all, it's just using the database engine as a caculator, so I removed it completely.

      If you look, you'll see I did change that to a for-loop. (Although admittedly it could have been better still), the index is needed because it grabs the next line after the one we are looking at, too.

      C.

Re: Lawn refactoring challenge
by Anonymous Monk on May 01, 2005 at 00:24 UTC
    In case any monk is in the area, I'm having a lawn refactoring challenge. You mow my yard, I'll tell you if its good enough. Also coming soon, the roofing-my-house refactoring challenge and paint-the-interior-of-my-house refactoring challenge. I look forward to your participation.
Re: Code refactoring challenge
by Anonymous Monk on Apr 29, 2005 at 21:51 UTC
    First things first. You'll have to explain to us why we're trying to reinvent the wheel with functions like "FileExists"...
    if( ( $statusLoadmsg = &FileExists( $logFile ) ) == $FALSE ) { &ResCode("WARNING"); &BrowseCode("File \"$logFile\" not found"); 
 &SummaryCode("WARNING"); return $WARNING; }
    Maybe...
    do {warn "$logfile not found"; return $warning} unless -e $logfile;
    would do?
    # # read whole log file # if( &readFileContents( $logFile, $ContentsLoadMsg ) == $FALSE ) { &ResCode("E_LOADMSG_ERROR"); : &BrowseCode("Could not open \"$logFile\""); 
 &SummaryCode("E_MISLOAD_ERROR"); return $FALSE; }
    ...Hmm, is "readFileContents" any different than $contents = <$logfile>;(with $/=undef)? Then of course we have to ask why bother with the exsistance check in the first place. Oh, and what's up with all the $TRUE and $FALSE. Looks ugly and error prone to me. And by the description in the update, I'd guess you should be able to do what you want in less than 10 lines of code.
      I know the code is crap. That's why I went to rewrite it! I didn't write it myself, I just maintain the bugger. Yes, its fugly, but its gotten the job done for the last year or two, (although adding even "use warnings" makes a pretty mess) .. I think this guy was a perl4ish programmer.

      Whatever, the exercise was not to pick holes in the crap that's already there, but write those 10 lines you claim ;)

      C.

        Whatever, the exercise was not to pick holes in the crap that's already there, but write those 10 lines you claim
        Well, the easiest thing would be to start over I think. Why don't you post the desired output for the data sample you provided (instead of having us try to compile and run the original program in our heads).