Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:
Hello monks,
I have created a script that manipulates text files around 80 MB in size (roughly 1.5 million lines), and the following block of code seems to be the bottleneck. The current code takes around 19 minutes to execute on a Sun box with a single 450 MHz processor (64-bit):
while (<SSLOG>)
{
    chomp (($date,$atime,$etime,$mth,$port,$attrID,$value) = split(/;/,$_));
    # Convert the log time to UTC time
    ($HH, $MM, $SS) = split(/:/, $etime);
    ($year, $month, $day) = $date =~ m/^(.{4})(.{2})(.*)/;
    $month -= 1;
    $Time = timelocal($SS, $MM, $HH, $day, $month, $year);
    # Collect non-zero data
    if ($Conf[7] =~ /NO/i) {
        if ($value !~ /^0/) {
            # Build a hash of MTypeHandles with their unique ports
            $Hash{$mth}{$port}++;
            my @entry = ();
            $attrID =~ s/\s*//g;
            $port =~ s/\s*//g;
            if ($port =~ /\-/) {
                $port =~ s/\-//;
            }
            $key = $attrID;
            $val = $DS_Info{$key};
            $val = join("", $val, $port);
            @entry = ($Time, $mth, $val, $value);
            # Populate the data array
            push @Data, [@entry];
        }
    }
    # Collect zero data
    elsif ($Conf[7] =~ /YES/i) {
        # Build a hash of MTypeHandles with their unique ports
        $Hash{$mth}{$port}++;
        my @entry = ();
        $attrID =~ s/\s*//g;
        $port =~ s/\s*//g;
        if ($port =~ /\-/) {
            $port =~ s/\-//;
        }
        $key = $attrID;
        $val = $DS_Info{$key};
        $val = join("", $val, $port);
        @entry = ($Time, $mth, $val, $value);
        # Populate the data array
        push @Data, [@entry];
    }
}
Any suggestions for improving performance?
Thanks
Re: Performance revision needed
by pfaut (Priest) on Jan 02, 2003 at 16:32 UTC
if ($Conf[7] =~ /NO/i) {
...
elsif($Conf[7] =~ /YES/i) {
@Conf appears to be set up outside of this loop. Evaluate this once and save the result instead of running a regex match every time through the loop.
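That point can be sketched like this minimal stand-alone example (the config value is invented; only the hoisting pattern is the point):

```perl
use strict;
use warnings;

# Hypothetical sketch of hoisting the check: @Conf never changes inside
# the loop, so run the regex once and keep the result in a cheap flag.
my @Conf;
$Conf[7] = 'NO';                                # stand-in config value
my $keep_zero = ($Conf[7] =~ /YES/i) ? 1 : 0;   # evaluated once

# inside the 1.5-million-line loop the test is then a plain scalar:
# next if !$keep_zero && $value =~ /^0/;
```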
$attrID =~ s/\s*//g;
$port =~ s/\s*//g;
I think replacing the '*' with '+' might help here.
$val = join("", $val, $port);
How about $val .= $port?
push @Data, [@entry];
Since @entry is a lexical variable, you should be able to use \@entry instead of [@entry] which needs to make a copy of the list.
Unless I'm missing something, both blocks of code do the same thing. It might be better from a maintenance standpoint to check if either condition is true and only have the block once.
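A minimal sketch of that merge, with made-up stand-in variables for $Conf[7] and the parsed field:

```perl
use strict;
use warnings;

# Hypothetical merge of the two identical blocks into one: the condition
# decides whether the record is wanted, and the body exists only once.
my ($conf7, $value) = ('NO', '1000');   # invented sample values
my $wanted = ($conf7 =~ /YES/i)
          || ($conf7 =~ /NO/i && $value !~ /^0/);
if ($wanted) {
    # ... single copy of the record-building code ...
}
```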
---
print map { my ($m)=1<<hex($_)&11?' ':'';
$m.=substr('AHJPacehklnorstu',hex($_),1) }
split //,'2fde0abe76c36c914586c';
Re: Performance revision needed
by gjb (Vicar) on Jan 02, 2003 at 16:37 UTC
Great ideas... you wouldn't happen to have a suggestion for some revised conditional statements, would you? I've been staring at this code too long...
Thanks again
Re: Performance revision needed
by talexb (Chancellor) on Jan 02, 2003 at 16:47 UTC
As I understand it, using the 'i' modifier in Perl slows down the regex code. You might benchmark using /YES/ || /yes/ instead.
You appear to be using the s/// operator to remove any spaces. My guess is that you can replace that with tr/ //d and get a speed improvement.
Finally, it looks like the two blocks of code are identical. Can you instead check for (NO and $value not zero) or YES, and keep only one copy of the block?
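The s/// vs tr///d claim is easy to measure with the core Benchmark module; here is a minimal harness (the sample string is invented, and tr cannot use the \s class, so the whitespace characters are listed explicitly):

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $sample = '  attid 123  ';   # invented sample field

# Compare the original substitution against tr///d on a copy of the
# sample; cmpthese prints rates and a relative-speed table.
cmpthese(100_000, {
    subst => sub { (my $s = $sample) =~ s/\s+//g;  },
    trd   => sub { (my $s = $sample) =~ tr/ \t//d; },
});
```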
--t. alex
Life is short: get busy!
Re: Performance revision needed
by rir (Vicar) on Jan 02, 2003 at 17:41 UTC
#!/usr/bin/perl
use warnings;
use strict;
use Time::Local;

# @Conf, %Hash, @Data and %DS_Info are assumed to be declared elsewhere
#
# move declarations out of loop
#
my ( $date, $atime, $etime, $mth, $port, $attrID, $value );
my ( $HH, $MM, $SS );
my ( $year, $month, $day );
my ( $Time, $val );
while (<SSLOG>) {
    # chomp one item not seven
    chomp;
    ($date, $atime, $etime, $mth, $port, $attrID, $value) =
        split /;/, $_;
    # combine conditions into one
    # could be optimized by knowing the data
    # and short circuiting
    #
    if ( ($Conf[7] =~ /NO/i) && ($value !~ /^0/)
         or ($Conf[7] =~ /YES/i)
       ) {
        # Move time computation inside conditional
        #
        # Convert the log time to UTC time
        ($HH, $MM, $SS) = split /:/, $etime;
        ($year, $month, $day) =
            $date =~ m/^(.{4})(.{2})(.*)/;
        $month -= 1;
        $Time = timelocal($SS, $MM, $HH, $day, $month, $year);
        # Build a hash of MTypeHandles with their unique ports
        $Hash{$mth}{$port}++;
        $attrID =~ s/\s*//g;
        $port =~ s/\s*//g;
        # Remove unnecessary condition
        #
        # if ( $port =~ /\-/ ) {
        #     $port =~ s/\-//;
        # }
        $port =~ s/\-//;
        # remove unnecessary variable
        #
        # $key = $attrID;
        # $val = $DS_Info{$key};
        #
        $val = $DS_Info{$attrID};
        # Remove unnecessary join.
        # $val = join ( "", $val, $port );
        #
        $val .= $port;
        # Remove unnecessary variable
        # @entry = ( $Time, $mth, $val, $value );
        #
        # This is a big array you are making
        # What do you do with it?
        # Is there any other choice?
        #
        # Populate the data array
        # push @Data, [@entry];
        #
        push @Data, [ $Time, $mth, $val, $value ];
    }
}
Thank you rir,
Your suggestions and a combination of the others have improved performance greatly.
Thanks again rir, and others who posted suggestions.
Re: Performance revision needed
by Thelonius (Priest) on Jan 02, 2003 at 18:45 UTC
Many of these suggestions are good, but they are all micro-optimizations. I suspect that your problem is that you are using a lot of memory. Try commenting out the lines that save the data, specifically, $Hash{$mth}{$port}++;
# and especially:
push @Data, [@entry];
Then run it and see how fast it is. If it is fast, then you should look into getting rid of @Data and maybe $Hash. It may be that $Hash isn't that large--it's hard to tell from your code. (Also, don't you want to edit $port before you increment $Hash{$mth}{$port}?)
The easiest thing to do is probably write ($Time, $mth, $val, $value) out to a new file and process that in a second pass just as you now process @Data. Or, possibly, you can do it in one pass and not save the data at all.
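A sketch of that two-pass idea (filename and sample record are invented; the inner loop stands in for the original while loop):

```perl
use strict;
use warnings;

our @last;   # captures the last record read back, for illustration

# First pass: stream each record to disk instead of growing @Data.
open my $tmp, '>', 'pass1.tmp' or die "pass1.tmp: $!";
for my $rec ([1041520320, 12, 'ds234', 1000]) {   # stand-in for the real loop
    print {$tmp} join(';', @$rec), "\n";
}
close $tmp or die $!;

# Second pass: re-read the intermediate file and process it the way
# the original script processed @Data.
open my $in, '<', 'pass1.tmp' or die "pass1.tmp: $!";
while (my $line = <$in>) {
    chomp $line;
    my ($Time, $mth, $val, $value) = split /;/, $line;
    @last = ($Time, $mth, $val, $value);
    # ... aggregation that used @Data goes here ...
}
close $in;
unlink 'pass1.tmp';
```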
Re: Performance revision needed
by Nitrox (Chaplain) on Jan 02, 2003 at 16:19 UTC
Update:
Thanks to IlyaM for pointing out my mis-choice of modules and explaining how Devel::SmallProf is a better choice than Devel::DProf.
-Nitrox
Re: Performance revision needed
by poj (Abbot) on Jan 02, 2003 at 19:33 UTC
Taking a guess at what your records look like:
20020202;10:10:10;12:12:12;12;234;attid ;1000
I created a 500k line file and ran the code suggested by rir on my winXP 1.1GHz 256Mb laptop and it took 57 seconds. I realise that because all the records are the same this is not a very scientific test.
Without the timelocal call it was 26 seconds.
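Given how much of the runtime that measurement attributes to timelocal, one possibility (a sketch, not from the thread; the helper name is invented) is to cache conversions keyed on the raw date/time strings, since log files usually repeat the same second across many lines:

```perl
use strict;
use warnings;
use Time::Local qw(timelocal);

# Hypothetical memoization of the conversion: repeated timestamps pay
# the timelocal cost only once per distinct second.
my %time_cache;

sub log_time {
    my ($date, $etime) = @_;
    my $key = "$date $etime";
    unless (exists $time_cache{$key}) {
        my ($HH, $MM, $SS) = split /:/, $etime;
        my ($year, $month, $day) = $date =~ /^(.{4})(.{2})(.*)/;
        $time_cache{$key} =
            timelocal($SS, $MM, $HH, $day, $month - 1, $year);
    }
    return $time_cache{$key};
}
```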
However, 19 minutes does seem a long time. Maybe memory or disk I/O is having a significant effect.
poj