Re^3: logfile parsing

That seems to be a good start, although I would probably use a hash (associative array) rather than a numerically-indexed array. Something like this (untested):

  use strict;
  my %ismatched = ();
  while (<FO>) {
    if (/Start Query \[ID:\s*(\d+)\s*\]/) {
      $ismatched{$1}++;
    }
    elsif (/End Query \[ID:\s*(\d+)\s*\]\s*\[duration:\s*(\d+)\s*\]/) {
      my $id = $1;
      $ismatched{$id}+= 2;
      my $duration = $2;
      if ($duration > 300) {
        print "Look at Id: $id -- took longer than it should.\n";
      }
    }
  }
  foreach (sort keys %ismatched) {
    if ($ismatched{$_} == 1) {
      print "Process id $_ never ended.\n";
    }
    elsif ($ismatched{$_} == 2) {
      print "No record of process id $_ ever starting.\n";
    }
    else {
      delete $ismatched{$_};
    }
  }
  print "There were a total of ", scalar(keys %ismatched), " unmatched processes.\n";

As you step through your log file, you capture the part of each line that you care about and keep a record of it. Basically, you keep an entry in your hash for each process ID you encounter. When a process starts, its entry is incremented by 1, and when it ends, its entry is incremented by 2. So when you are done, all the 1's are processes that started but never finished, all the 2's are processes that ended but never started, and all the 3's are processes that started and ended correctly. This assumes each ID is logged at most once as a start and at most once as an end.

You could capture the SQL in much the same way, perhaps using another hash to associate it with the process ID, or having your original ismatched hash store the duration and SQL information in a subordinate hash.
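
Here is a rough sketch of the subordinate-hash idea (lightly tested at best, so treat it as a starting point). I don't know what your SQL lines look like, so the [SQL: ...] field on the Start Query line is purely an assumption, as are the names parse_log and %query. Adjust the regexps to match your real log.

```perl
use strict;
use warnings;

# ASSUMED log format (hypothetical, adjust to your real log):
#   Start Query [ID: 12] [SQL: select * from foo]
#   End Query [ID: 12] [duration: 450]
sub parse_log {
    my ($fh) = @_;
    # id => { started => 1, ended => 1, duration => N, sql => '...' }
    my %query;
    while (my $line = <$fh>) {
        if ($line =~ /Start Query \[ID:\s*(\d+)\s*\](?:\s*\[SQL:\s*(.*?)\s*\])?/) {
            my ($id, $sql) = ($1, $2);
            $query{$id}{started} = 1;
            $query{$id}{sql} = $sql if defined $sql;
        }
        elsif ($line =~ /End Query \[ID:\s*(\d+)\s*\]\s*\[duration:\s*(\d+)\s*\]/) {
            my ($id, $duration) = ($1, $2);
            $query{$id}{ended}    = 1;
            $query{$id}{duration} = $duration;
        }
    }
    return \%query;
}

# Made-up sample data, read through an in-memory filehandle.
my $sample = <<'LOG';
Start Query [ID: 1] [SQL: select * from foo]
End Query [ID: 1] [duration: 450]
Start Query [ID: 2] [SQL: select 1]
End Query [ID: 3] [duration: 5]
LOG

open my $fh, '<', \$sample or die "Cannot open in-memory log: $!";
my $query = parse_log($fh);
close $fh;

for my $id (sort { $a <=> $b } keys %$query) {
    my $q   = $query->{$id};
    my $sql = defined $q->{sql} ? $q->{sql} : '(SQL not captured)';
    if (!$q->{ended}) {
        print "Process id $id never ended: $sql\n";
    }
    elsif (!$q->{started}) {
        print "No record of process id $id ever starting.\n";
    }
    elsif ($q->{duration} > 300) {
        print "Look at Id: $id -- took $q->{duration}: $sql\n";
    }
}
```

Keeping separate started/ended flags instead of the 1/2/3 counter makes each record a little easier to read once it also carries the duration and SQL, but either approach should work.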


Re^4: logfile parsing by phoneguy (Novice) on Oct 26, 2006 at 17:42 UTC
I had to change the regexps to what I needed, but it works wonderfully! Thank you so much!