TStanley has asked for the wisdom of the Perl Monks concerning the following question:

I have the following piece of code from a program:
open(RPT,">$DIR/$RPT")||die"Can't open $RPT: $!\n"; print RPT "Report Date: $month/$day/$year\n"; print RPT "$RPT\n"; print RPT "============================\n"; my($key,$msg); foreach $key(keys %hash){ foreach $msg(@Messages){ if($hash{$key}{'error_message'}=~/$msg/ and $hash{$key}{'date'}=~ +$DATE){ print RPT "$hash{$key}{'packet_id'}\t\t$hash{$key}{'origin'}\n +"; print RPT "$hash{$key}{'date'}\t$hash{$key}{'qualifier'}\n"; print RPT "$hash{$key}{'time'}\t$hash{$key}{'terminal_id'}\n"; print RPT "$hash{$key}{'device_id'}\t\t$hash{$key}{'applicatio +n_id'}\n"; print RPT "$hash{$key}{'catalog_code'}\t\t$hash{$key}{'functio +n_id'}\n"; print RPT "$hash{$key}{'status'}\t\t$hash{$key}{'severity'}\n" +; print RPT "$hash{$key}{'error_message'}\n"; print RPT "=================================================== +=\n"; }## END OF IF STATEMENT }## END OF INNER FOREACH STATEMENT }## END OF OUTER FOREACH STATEMENT close RPT;
Is there any way I could improve the overall speed of this? I have already sped up the overall program by a minute, just from pre-compiling the regular expressions, but I would like to clean this section up as well. As usual, any ideas/suggestions are appreciated.

TStanley
--------
"Suppose you were an idiot... And suppose you were a
member of Congress... But I repeat myself." -- Mark Twain

Replies are listed 'Best First'.
Re: How could I make this code more optimized
by japhy (Canon) on Feb 06, 2002 at 20:41 UTC
    The things in @Messages should be qr//'ed regexes. That will definitely make your regex stuff go faster. (Or did you say you already did that?) And if you only use $key for subscripting the hash, why not change keys %hash to values %hash? That way you can avoid using $hash{$key} over and over and over again.

    _____________________________________________________
    Jeff[japhy]Pinyan: Perl, regex, and perl hacker.
    s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;

      ++! The power of aliases, but it never occured to me do that, just changed Triple syntax getopt to use it, Thanks!

      --
      perl -pe "s/\b;([st])/'\1/mg"

Re: How could I make this code more optimized
by buckaduck (Chaplain) on Feb 06, 2002 at 20:52 UTC
    1. I would use values instead of referencing keys repeatedly. I have no idea if this is any faster. Anyone?
      (Update: Well, if japhy agrees, that's good enough for me...)
    2. I would do the date test before looping through the Messages.
    3. I'd use grep instead of what you have, but probably for no good reason. If it's appropriate, looping through Messages can be further improved by making it break out early when a match is found...
    open(RPT,">$DIR/$RPT")||die"Can't open $RPT: $!\n"; print RPT "Report Date: $month/$day/$year\n"; print RPT "$RPT\n"; print RPT "============================\n"; foreach my $value (values %hash) { next unless $value->{date} =~ $DATE; next unless grep { $value->{error_message} =~ /$_/ } @Messages; print RPT ... ... } close RPT;

    buckaduck

Re: How could I make this code more optimized
by VSarkiss (Monsignor) on Feb 06, 2002 at 20:53 UTC

    Well, this won't necessarily make it faster, but it'll be a lot easier to read. Create a reference to $hash{$key} and use that instead: (note, untested)

    my ($key, $msg, $ref); # stick with the naming convention ;-) foreach $key (keys %hash) { $ref = \$hash{$key}; foreach $msg (@Messages) { if ($ref->{error_message} =~ /$msg/ and $ref->{date} =~ $DATE) { # # Notice one print call will suffice, # just line up what you want to print, # or concatenate into a single string. # print RPT "$ref->{packet_id}\t$ref->{origin}\n", "$ref->{date}\t$ref->{qualifier}\n", # etc.
    Another thing: you don't show what's in @Messages or $DATE. If either of those are constant strings, don't bother with a pattern match at all; use string comparison.

    HTH

Re: How could I make this code more optimized
by TStanley (Canon) on Feb 08, 2002 at 17:58 UTC
    Taking japhy's advice, I also qr'ed the items inside the @Messages array, and used the reference to the hash as suggested by VSarkiss.
    open(RPT,">$DIR/$RPT")||die"Can't open $RPT: $!\n"; print RPT "Report Date: $Date\n"; print RPT "$RPT\n"; print RPT "============================\n"; my($key,$msg,$ref); foreach $key(keys %hash){ $ref=$hash{$key}; foreach $msg(@Messages){ if($ref->{error_message}=~$msg and $ref->{date} eq $Date){ print RPT "$ref->{packet_id}\t$ref->{origin}\n", "$ref->{date}\t$ref->{qualifier}\n", "$ref->{time}\t$ref->{terminal_id}\n", "$ref->{device_id}\t\t$ref->{application_id}\n", "$ref->{catalog_code}\t\t$ref->{function_id}\n", "$ref->{status}\t\t$ref->{severity}\n", "$ref->{error_message}\n", "==================================================== +\n"; }## END OF IF STATEMENT }## END OF INNER FOREACH STATEMENT }## END OF OUTER FOREACH STATEMENT close RPT;
    Many thanks to all for your help.

    TStanley
    --------
    "Suppose you were an idiot... And suppose you were a
    member of Congress... But I repeat myself." -- Mark Twain

      To repeat one of my earlier suggestions:

      Why on Earth are you testing $ref->{date} eq $Date inside of the inner foreach loop? If they're not equal, skip the inner foreach loop altogether!

      Admittedly, this was probably more important back when you were using a regex match instead of eq. But it could still make a big difference...

      Update: And while I'm at it, these lines:

      foreach $key (keys %hash) { $ref = $hash{$key};
      Can be replaced with this:
      foreach $ref (values %hash) {

      buckaduck