Just looking at your sub process _it
It's nice to see you using my. but it would be so much more effective if you used it at the point/scope at which the variable is used. It would save you having to do stuff like undef($tmp_c);
$unknown is an appropriate name :). It appears nowhere else in the entire file.
You make a local copy of the @_ here, then a few lines further on, you reduce @in using grep into @data and never use @in again.
Better to reduce memory and cycles by supplying @_ directly to the grep.
A quick test shows that $count = $tmp =~ tr/@/@/; is about 4 times faster at counting occurances of a single char.
This is .. er,.. strange. If the idea is that ...&& $data[1] ) will mean the greping will only happen if the array has more than one line, too late! You already grep'd the array in the first part of the conditional. In fact, you will only get to check if there is more than one line in the array if the grep has already discovered a line containing /Error-Handler/, which presumably means the was more than one line?
Further, if the reason for the conditional is to prevent wasted cycles by greping to remove these lines unless they exists, again too late. You already grep'd to check. Ultimately, if there are no /Error-Handler/ lines, executing the code to remove them will have no effect., but at least you will only have grep'd once not twice.
foreach $line (@data) { @line = split(/:/, $line);
You take a line from an array called @data, and call it $line. split $line into an array called @line which then is referred to as $linen in the rest of the loop. Nothing wrong exactly, but mighty confusing.
Reminds me of the MP sketch about the Bruce's ;).
Why /^(\s+|)$/) instead of /^\s*$/? I vaguely remember something about (:\s+|) as an optimisation, but the fact that you were forcing the spaces or nothing to be captured and then discarded must surely be costing you more than you were saving?
That said, the value you are checking, $type, comes from the previous line $type = $states{$2}; .
Now if $2 is not a key of %states, then that will return undef so that line would me much better written as
warn "NO type: $line[0]\n" if not defined $type;
or even
warn "..." unless $type;<code> <p>which are both more efficient and clearer. <p>However, you then go on to use $type as a hash key in <p><code> $by_type{$from}{$type} += $tmp_c;
Which if you had -w or use warnings; in your code, would result in an runtime warning of something like
Use of uninitialized value in hash element at line nnn and would remove the need for issuing your own. It is also possibly a more acurate reason for the note in your code:
print <<EOF; Note: Due to the methods employed to log the mta data, the breakdown by typ +e of mail will not necessarily add up to the total messages sent from an IP. The total i +s accurate and the breakdown may be short data. EOF
I believe that the sub could be reduced to somthing like the following at least. This is completely untested, and changed by (quick) inspection only , but I think it should be close and run quite a bit quicker. As it is called many times at the heart of your application, it might make a substantial difference to your run times. I'd like to here the results.
I think you could also consolidate at least 1 if not both of the temporary count vars $tmp_c & $total_c, but I couldn't wrap my head around the logic.
sub process_it { my ($rec_line) = grep(/received from internet:/, @_); my ($from, $tmp) = ( split(/:/, $rec_line) )[3, -1]; $from =~ s/(fromhost=|$CHARS)//g; my $count = $tmp =~ s/@/@/g; my @data = grep(!/received from internet:/ and !/Error-Handler/, @ +_); my $total_c; foreach my $line (@data) { my @line = split(/:/, $line); my ($tmp_c, $type); if ($line[0] =~ /( |-)([a-zA-Z]+)$/) { $type = $states{$2}; warn "NO type: $line[0]\n" if ($type =~ /^(\s+|)$/); $tmp_c += ($type =~ /del_rem/) ? $line[$#line] =~ tr/@/@/ +: 1; } $by_type{$from}{$type} += $tmp_c; $total_c += $tmp_c; } $by_ips{$from}+= $total_c; }
A quick scan of the rest of the code suggests that some of these changes could be beneficial there too, but that's left as AEFTP.
Hope it helps some.
Examine what is said, not who speaks.
The 7th Rule of perl club is -- pearl clubs are easily damaged. Use a diamond club instead.
In reply to Re: The need for speed
by BrowserUk
in thread The need for speed
by l2kashe
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |