Thanks to all for the pointers on tr/// as opposed to s///. I hadn't come up with a decent way to count how many recipients a message had, and I remembered a post here that used =~ s/// to count (I think it was dots), so I used it because it fit.
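For anyone following along, here is a minimal sketch of counting with tr/// instead of s///. The recipient string and the comma separator are assumptions for illustration only; the real log format may differ.

```perl
use strict;
use warnings;

# Hypothetical recipient list (assumed comma-separated).
my $to = 'alice@example.com,bob@example.com,carol@example.com';

# With an empty replacement list and no /d flag, tr/// leaves the
# string untouched and returns how many characters matched.
my $commas = ($to =~ tr/,//);

# N separators means N + 1 recipients, unless the field is empty.
my $recipients = length($to) ? $commas + 1 : 0;

print "$recipients\n";
```

Unlike s///g, tr/// never modifies $to here and does not involve the regex engine, which is why it was suggested for plain character counting.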
On the $id thing: yes, the appropriate regex is
$id = $1 if (/msgid=<([^>]+)>/);
as the chars < and > will never be valid chars in the ID.
The reason for the multiple greps: I have a data set. Let's say in instance 1 it's 3 lines, and in instance 2 it's 2 lines. In instance 1 we have
1 received mail line
1 Error-Handler line
1 bounced line
In instance 2 we have
1 received line
1 Error-Handler line
What I am attempting to deal with is those disparate lines. So I take the received line from the data set. Then, if there is more than 1 item left in the data set and Error-Handler lines are among them, I remove the Error-Handler lines. If not, I leave it alone, because I need to count that the message came through; it was just processed outside the scope of the log file itself.
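Roughly, the grep logic described above looks like the sketch below. The sub name classify and the sample strings are made up for illustration; the actual log lines are much longer and the real patterns will differ.

```perl
use strict;
use warnings;

# Hypothetical helper: takes one message's lines, returns the
# received line followed by whatever is left worth counting.
sub classify {
    my @lines = @_;

    # Take the received line out of the data set first.
    my ($received) = grep {  /received/ } @lines;
    my @rest       = grep { !/received/ } @lines;

    # Drop Error-Handler lines only when more than one item is left;
    # a lone Error-Handler line stays so the message is still counted.
    if (@rest > 1 && grep { /Error-Handler/ } @rest) {
        @rest = grep { !/Error-Handler/ } @rest;
    }
    return ($received, @rest);
}

# Instance 1: received, Error-Handler, bounced
my ($recv1, @kept1) = classify('received mail', 'Error-Handler', 'bounced');
# Instance 2: received, Error-Handler
my ($recv2, @kept2) = classify('received', 'Error-Handler');

print "instance 1 keeps: @kept1\n";
print "instance 2 keeps: @kept2\n";
```

In instance 1 only the bounced line survives, and in instance 2 the single Error-Handler line is kept.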
On the whole conditional against $id: it really is just style. If I am only doing one thing based on truth, I inline it; if not, I use the braces.
Thanks for the pointer about not capturing if I'm not using it. Makes sense.
And as for your map usage, I am still a map newbie, though I will definitely see what mileage I can get out of it. Thanks for the input. Oh yeah, in terms of tossing the data away: if I don't, I run out of memory. I need to process only one file, extract the relevant data, and then clean my %data out, or I simply don't have any memory left. As you can see from the data samples, the lines are long. It's amazing how your paradigm shifts along with the size of your data set :P
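To show what I mean by cleaning out %data, here is a stripped-down sketch of the one-file-at-a-time pattern. The %logs hash, the file names and the %totals summary are stand-ins I made up so the example runs on its own; the real script reads actual log files and extracts different data.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for log files, read via in-memory filehandles.
my %logs = (
    'a.log' => "msgid=<1> received\nmsgid=<1> bounced\n",
    'b.log' => "msgid=<2> received\n",
);

my %totals;   # small summary that survives across files
my %data;     # per-file working set; this is what eats the memory

for my $name (sort keys %logs) {
    open my $fh, '<', \$logs{$name} or die "open $name: $!";
    while (my $line = <$fh>) {
        chomp $line;
        next unless $line =~ /msgid=<([^>]+)>/;
        push @{ $data{$1} }, $line;   # group lines by message ID
    }
    close $fh;

    # Extract only what is needed from this file...
    $totals{messages} += keys %data;
    $totals{lines}    += @$_ for values %data;

    # ...then empty the working set before the next file.
    %data = ();
}

print "$totals{messages} messages, $totals{lines} lines\n";
```

Only the small %totals hash grows across files, so the memory used is bounded by the largest single file rather than the whole log set.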
/* And the Creator, against his better judgement, wrote man.c */