Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I'm trying to write a script, but don't know where to start. We have an application, and whenever an issue occurs, we see a lot of processing for a given user; this repeated processing gets captured in the logs. The idea is to come up with a script that will process the log file of any given day and send an alert in case of repeated processing. The relevant row looks like the one given below; what we need is the uid (here uid=xy1234) and its occurrence count. The script should accept a threshold value, like 10 or 20, that decides when to send an alert.

    20100120 05:00:02 37a0045b <ABCD:ABXT> From MV Modify <uid=xy1234,ou=Internal,ou=people,dc=xyz,dc=example,dc=com>

There is a lot of work to do on my part, like designing scripts for a number of patterns/files/occurrences. What I really want is to find a way to accomplish this. Please let me know if you want additional info. Thanks in advance.

Regards, Pamela

Replies are listed 'Best First'.
Re: Script Help
by Jenda (Abbot) on Jan 21, 2010 at 22:29 UTC

    You can either start by learning a programming language (Perl would be a fine choice, and the code for such a task would be fairly simple) or by hiring someone who has already invested some time in learning one. And if you eventually do start doing anything, do not overdesign it. You do not need to create one generic script; it's fine to have a different (five-line) script for each file/pattern type.

    Jenda
    Enoch was right!
    Enjoy the last years of Rome.

Re: Script Help
by AnomalousMonk (Archbishop) on Jan 21, 2010 at 19:49 UTC

    If the text within your OP represents a single line from a closed log file, i.e., if you are not trying to tail a constantly growing log file, the following might represent a possible approach to a solution (untested):

    use warnings;
    use strict;

    my $log_file  = 'file.log';
    my $uid       = 'xy1234';
    my $threshold = 20;

    my @occurrences;

    open my $fh_log, '<', $log_file or die "opening $log_file: $!";

    LINE:
    while (defined(my $line = <$fh_log>)) {
        next LINE unless $line =~ m{ uid=$uid }xms;
        push @occurrences, $line;
    }

    close $fh_log or die "closing $log_file: $!";

    if (@occurrences > $threshold) {
        print "occurrences for $uid above $threshold \n";
        print @occurrences;
    }

    Of course, the print statements at the very end will have to be replaced with code to compose and send an appropriate e-mail message, etc., but I assume you know how to do that. (In any event, I'm not the best one to ask about that.)

      Thanks for the response. Yes, I can configure the mail-sending part of it. Here are my comments on your suggested script:

      - Although the log file is growing, the alert would be based on a snapshot of the current content.
      - We cannot set the $uid variable, because we don't know the user in advance; that is the first thing we need to find. Logically, the script will scan through the log file, pick out every uid, then count the number of occurrences; if the count goes above the threshold, it sends an alert.

      Please help me in this regard.

      Regards, Pamela

        Again assuming you are processing a static file, I would tend to attack the problem this way:

        • Define a regex to recognize and extract a unique UID. Something like
              my ($uid) = $line =~ m{ <uid= (\w+) }xms;
          seems like a good first guess.
        • Define a regex to recognize (and, if necessary, to extract) whatever an 'occurrence' may be.
        • If both a UID and an occurrence occur in a line, increment a UID count as appropriate to the occurrence. This may be as simple as
              my %occurrences;
              # enter line processing loop
              # ..., recognize occurrence, extract UID from line, ...
              $occurrences{$uid}++;
              # process next line, etc.
          if all occurrences have equal weight.
        • When all lines in the file are processed, loop through the hash and send an appropriate message for each UID for which the count of occurrences exceeds the specified threshold.
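
        The steps above can be sketched as one small script. This is only a sketch under stated assumptions: it reuses the first-guess <uid=...> regex from above, treats every line containing a UID as one occurrence of equal weight, and prints where the real script would send mail. The sub names count_uids and uids_over_threshold, the command-line arguments, and the default threshold of 20 are made up for illustration.

```perl
use strict;
use warnings;

# Count how often each UID appears in the lines read from a filehandle.
# Assumption: every line containing "<uid=...>" is one occurrence.
# Returns a reference to a hash of UID => count.
sub count_uids {
    my ($fh) = @_;
    my %occurrences;
    while (defined(my $line = <$fh>)) {
        # Same first-guess pattern as above: capture the word after "<uid=".
        my ($uid) = $line =~ m{ <uid= (\w+) }xms;
        next unless defined $uid;    # line has no UID, skip it
        $occurrences{$uid}++;
    }
    return \%occurrences;
}

# Return the UIDs whose count exceeds the threshold, busiest first
# (ties broken alphabetically so the order is stable).
sub uids_over_threshold {
    my ($occurrences, $threshold) = @_;
    return sort { $occurrences->{$b} <=> $occurrences->{$a} || $a cmp $b }
           grep { $occurrences->{$_} > $threshold }
           keys %$occurrences;
}

# Hypothetical command line: perl uid_alert.pl LOGFILE [THRESHOLD]
# Nothing runs when no log file is given.
if (@ARGV) {
    my ($log_file, $threshold) = @ARGV;
    $threshold //= 20;    # assumed default, as in the earlier example

    open my $fh_log, '<', $log_file or die "opening $log_file: $!";
    my $occurrences = count_uids($fh_log);
    close $fh_log or die "closing $log_file: $!";

    # Placeholder for the mail-sending code.
    for my $uid (uids_over_threshold($occurrences, $threshold)) {
        print "occurrences for $uid above $threshold: $occurrences->{$uid}\n";
    }
}
```

        Keeping the counting and the threshold check in separate subs means the threshold can be changed, or different thresholds tried, without rereading the file.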