logfile parsing

phoneguy has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: logfile parsing by Fletch (Bishop) on Oct 26, 2006 at 15:17 UTC
Write a parser. Look for "Start Query", remember the id as an open query and save the start time read the SQL and save it associated with the id in a hash when you reach an "End Query", compute the time it took and take the id out of your list of open queries lather, rinse, repeat for all lines of the logfile When you reach the end of the file, you should have a list of what finished and how long it took and what was still pending that had no matching "End Query" lines. If you want more specific than that, show your effort and what code you've tried so far.	[reply]
Re^2: logfile parsing by phoneguy (Novice) on Oct 26, 2006 at 15:36 UTC
This is what I have so far. I still need to figure out how I'm going to go through the $ismatched portion of my code...let alone if I'm doing this right. I'll be _really_ honest when I say that I don't have a clue as to what I'm doing. Programming isn't my first career choice...but, telephones are :) #!/usr/bin/perl -w use strict; my $processid; my $duration; my $ismatched; my $count; my $line; my $fo; my $filename; my $id; my $durid open(FO, $filename); while (defined ($line = <FO>)) { # Start if ($line =~ m/Start$/) { $count++; # Grab text from ID $processid =~ m/Id\:/; $ismatched[$processid] = 0; } if ($line =~ m/End$/ { $count--; $processid =~ m/ID:/; $ismatched[$processid] =1; $duration[$processid] =~ m/duration:/; if($duration > 300) { print "Look at Id: $processid -- took longer than it shoul +d.\n"; } } } # finding out processes need to go here. [download]	[reply] [d/l]
Re^3: logfile parsing by ptum (Priest) on Oct 26, 2006 at 16:51 UTC
That seems to be a good start, although I would probably use a hash (associative array) rather than a numerically-indexed array. Something like this (untested): use strict; my %ismatached = (); while (<FO>) { if (/Start Query \[ID:\s(\d+)\s\]/) { $ismatched{$1}++; } elsif (/End Query \[ID:\s(\d+)\s\]\s\[duration:\s(\d+)\s*\]/) +{ my $id = $1; $ismatched{$id}+= 2; my $duration = $2; if ($duration > 300) { print "Look at Id: $id -- took longer than it should.\n"; } } } foreach (sort keys %ismatched) { if ($ismatched{$_} == 1) { print "Process id $_ never ended.\n"; } elsif ($ismatched{$_} == 2) { print "No record of process id $_ ever starting.\n"; } else { delete $ismatched{$_}; } } print "There were a total of ", scalar(keys %ismatched), " unmatched + processes.\n"; [download] As you step through your log file, you capture the part of each line that you care about, and keep a record. Basically, you keep an entry in your hash for each process ID you encounter. If starts, it is incremented to a 1, and if it ends, it is incremented by 2. So when you are done, all the 1's are processes that started but never finished, all the 2's are processes that ended but never started, and all the 3's are processes that started and ended correctly. You could capture the SQL in much the same way, perhaps using another hash to associate it to the process ID, or having your original ismatched hash store duration and SQL information in a subordinate hash.	[reply] [d/l]
Re^4: logfile parsing by phoneguy (Novice) on Oct 26, 2006 at 17:42 UTC
Re^3: logfile parsing by phoneguy (Novice) on Oct 26, 2006 at 16:57 UTC
Ok...I think I've figured out how to get my process ID: `if($line =~/Id:\s*(\d+)/){ $processid = $1; }` [download] Does that make sense?	[reply] [d/l]
Re: logfile parsing by madbombX (Hermit) on Oct 26, 2006 at 16:05 UTC
First off, what program is generating this logfile? Someone may have already written a parser for this type of logfile if it isn't a custom rolled application. Try searching CPAN for the module. If it is a custom rolled app, then I would suggest writing an OO module that takes each line as a separate object, parses it and dumps it off to a function to handle each class/category of line and go from there. Aggregating the stats can become another part of the module.	[reply]
Re^2: logfile parsing by phoneguy (Novice) on Oct 26, 2006 at 16:28 UTC
The program is a PBX Application that was written in-house, but, the person who wrote it is no longer around, and I'm not a perl <classification>, just someone who wanted to try it out and get my feet wet. Writing an OO module is probably a bit above and beyond the scope of my skillset.	[reply]
Re: logfile parsing by wolfger (Deacon) on Oct 27, 2006 at 16:21 UTC
I have a logfile that is somewhat unique in the fact that, well, it's a jumbled up mess. And you think that's unique? Ha! Lucky you. In my experience, that's common. I would add something helpful, but I see you've already gotten what you need from another monk... -- Random Synapse Firing	[reply] [d/l]