Logfile analysis : How to differentiate records with timestamp

tuakilan has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks, I am trying to write a Perl script that reads a log file and locates records that have the same id number but whose timestamps differ by more than 1 hour within the same day.

Here is the log file, which was generated by a closed-source C program that is beyond my knowledge.

ReadLog.txt

2008-Feb-01 00:00:02 UTC (GMT +0000) - id = 000000001, Status = failed
2008-Feb-01 00:10:02 UTC (GMT +0000) - id = 000000002, Status = success
2008-Feb-01 00:20:02 UTC (GMT +0000) - id = 000000003, Status = failed
2008-Feb-01 00:30:02 UTC (GMT +0000) - id = 000000004, Status = success
2008-Feb-01 00:40:02 UTC (GMT +0000) - id = 000000001, Status = success
2008-Feb-01 00:50:02 UTC (GMT +0000) - id = 000000001, Status = success
2008-Feb-01 01:00:02 UTC (GMT +0000) - id = 000000001, Status = failed
2008-Feb-01 01:10:02 UTC (GMT +0000) - id = 000000007, Status = success
2008-Feb-01 01:20:02 UTC (GMT +0000) - id = 000000003, Status = failed
2008-Feb-01 01:30:02 UTC (GMT +0000) - id = 000000009, Status = success

ResultLog.txt

ID               last attempted           attempts
============================================================
000000001        2008-Feb-01 01:00:02     1
000000003        2008-Feb-01 01:20:02     1

I tried the following, but the output wasn't what I wanted.

use strict;
use warnings;
use DateTime::Format::Strptime;
# my $Strp = new DateTime::Format::Strptime(pattern     => '%Y-%b-%d %T',);
my $Strp = new DateTime::Format::Strptime(pattern     => '%Y-%b-%d %T', on_error => 'croak');
my $infile = 'ReadLog.txt';
my $outfile = 'ReportLog.txt';
my($fh_out, $fh);
my %lookup;
my $status = 'failed';
my $time_delta = 3600;  # seconds = 1 hour
open($fh_out, '>', $outfile) or die "Could not open outfile: $!";
open($fh, '<', $infile) or die "Could not open logfile: $!";
while (<$fh>) {
    next unless /$status/;
    $_ =~ m/^(.*) UTC.*refs = (\d+)$/;
#    my $dt = $Strp->parse_datetime("$1");
    unless (my $dt = $Strp->parse_datetime($1))
    {
     warn "invalid datetime: $1";
     next;
    }
    my $timestamp = $dt ->epoch();
    my $refs = "$2";
    if ( defined($lookup{$refs}) && $lookup{$refs} + $time_delta <= $timestamp ) {
        print $fh_out "REFS $refs: occurrences at " . $lookup{$refs} . "and $timestamp \n";
        print "REFS $refs: occurrences at " . $lookup{$refs} . " and $timestamp \n";
    }
    $lookup{$refs} = $timestamp;
}
close $fh_out;
#close $fh;

Re: Logfile analysis : How to differentiate records with timestamp by moritz (Cardinal) on Mar 11, 2008 at 08:09 UTC
The output wasn't desirable because your program doesn't even compile. You haven't declared `$channel` and `$dt` (the latter is declared only in a comment). The missing regex `$channel` should be the first thing you fix; once you have that, you can use `$1` to create `$dt`.
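To illustrate the order suggested above (get the capture right first, then build the date object from it), here is a minimal sketch. It is not the poster's script: `parse_line` is a hypothetical helper, the regex is written against the sample lines in ReadLog.txt (`id = ...`, not `refs = ...`), and the core module Time::Piece (shipped with Perl from 5.10 on) is swapped in for DateTime::Format::Strptime so that the example needs nothing from CPAN.

```perl
use strict;
use warnings;
use Time::Piece;

# Hypothetical helper: parse one log line.
# Returns (epoch seconds, id, status), or an empty list if the line
# does not look like a log record or the time stamp cannot be parsed.
sub parse_line {
    my ($line) = @_;

    # Step 1: make the regex match the real log format and capture the fields.
    my ( $stamp, $id, $status ) = $line =~ m{
        ^ ( \d{4}-[A-Za-z]{3}-\d\d [ ] \d\d:\d\d:\d\d ) [ ] UTC
        .* id [ ] = [ ] ( \d+ ) , [ ] Status [ ] = [ ] ( \w+ )
    }x or return;

    # Step 2: declare the date object in the scope where it is used later,
    # not inside the condition of an unless block.
    my $dt = eval { Time::Piece->strptime( $stamp, '%Y-%b-%d %H:%M:%S' ) };
    return unless defined $dt;

    return ( $dt->epoch, $id, $status );
}

my @fields = parse_line(
    '2008-Feb-01 01:00:02 UTC (GMT +0000) - id = 000000001, Status = failed'
);
print "epoch=$fields[0] id=$fields[1] status=$fields[2]\n" if @fields;
```

Returning an empty list on failure lets the caller write `my @f = parse_line($_) or next;` inside the read loop, which avoids using `$1` after a match that did not succeed.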
Re: Logfile analysis : How to differentiate records with timestamp by hipowls (Curate) on Mar 11, 2008 at 10:01 UTC
This produces the required output, but I'm not convinced that it does what is required. "last attempted" is actually the time of the first attempt in that period, and "attempts" is the number of additional failed attempts; if only one failure occurs, it will be logged as 0 attempts. Anyway, those are not my problems. I had fun with the regex ;-)

#!/net/perl/5.10.0/bin/perl

use strict;
use warnings;

use 5.010_000;

use DateTime::Format::Strptime;
use Readonly;

Readonly my $delta => 60 * 60;

my %stats;

my $log_parser = qr{
    \A
    (?<time_stamp> \d{4}-\p{IsAlpha}{3}-\d\d [ ] \d\d:\d\d:\d\d )
    .*
    [ ] - [ ] id [ ] = [ ] (?<id> \d+ ),
    [ ] Status [ ] = [ ] (?<status> \w+ )
}msx;

my $date_parser = new DateTime::Format::Strptime(
    pattern  => '%Y-%b-%d %T',
    on_error => 'croak',
) or die "Can't create date parser: $!\n";

while ( my $line = <DATA> ) {
    if ( $line =~ /$log_parser/ ) {
        my ( $time_stamp, $id, $status ) = @+{qw(time_stamp id status)};
        next if $status eq 'success';

        my $date = $date_parser->parse_datetime($time_stamp);

        # The first test below was lost in the post; "!exists" is a reconstruction.
        if ( !exists $stats{$id}
            || $stats{$id}[-1][0]->subtract_datetime_absolute($date)->seconds > $delta )
        {
            push @{ $stats{$id} }, [ $date, $time_stamp, 0 ];
        }
        else {
            ++$stats{$id}[-1][2];
        }
    }
}

if ( scalar keys %stats ) {
    say 'ID last attempted attempts';
    say '=' x 44;
    foreach my $id ( sort keys %stats ) {
        foreach my $time ( @{ $stats{$id} } ) {
            printf "%s %s %8d\n", $id, @{$time}[ 1, 2 ];
        }
    }
}

__DATA__
2008-Feb-01 00:00:02 UTC (GMT +0000) - id = 000000001, Status = failed
2008-Feb-01 00:10:02 UTC (GMT +0000) - id = 000000002, Status = success
2008-Feb-01 00:20:02 UTC (GMT +0000) - id = 000000003, Status = failed
2008-Feb-01 00:30:02 UTC (GMT +0000) - id = 000000004, Status = success
2008-Feb-01 00:40:02 UTC (GMT +0000) - id = 000000001, Status = success
2008-Feb-01 00:50:02 UTC (GMT +0000) - id = 000000001, Status = success
2008-Feb-01 01:00:02 UTC (GMT +0000) - id = 000000001, Status = failed
2008-Feb-01 01:10:02 UTC (GMT +0000) - id = 000000007, Status = success
2008-Feb-01 01:20:02 UTC (GMT +0000) - id = 000000003, Status = failed
2008-Feb-01 01:30:02 UTC (GMT +0000) - id = 000000009, Status = success

Output

ID last attempted attempts
============================================
000000001 2008-Feb-01 00:00:02 1
000000003 2008-Feb-01 00:20:02 1
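The two caveats mentioned in this reply ("last attempted" holds the first failure of a period, and "attempts" starts at 0) could be addressed along the following lines. This is only a sketch under stated assumptions: `summarize_failures` is a hypothetical helper, the records are assumed to be failures only and already parsed into `[ id, seconds, time stamp ]`, and plain seconds since midnight stand in for real epoch values so that no date module is needed. Whether these semantics are what the original poster wants is not clear from the thread; with the sample data this variant reports 2 attempts per id, not the 1 shown in ResultLog.txt.

```perl
use strict;
use warnings;

# Hypothetical helper. Groups failures per id into periods: a failure more
# than $delta seconds after the first failure of the current period opens
# a new period, as in the script above.
# Differences from that script: the count starts at 1, and the stored time
# stamp is that of the most recent failure in the period.
sub summarize_failures {
    my ( $records, $delta ) = @_;
    my %stats;
    for my $rec (@$records) {
        my ( $id, $seconds, $stamp ) = @$rec;
        my $periods = $stats{$id} ||= [];
        if ( !@$periods || $seconds - $periods->[-1]{start} > $delta ) {
            push @$periods, { start => $seconds, last => $stamp, attempts => 1 };
        }
        else {
            $periods->[-1]{last} = $stamp;
            $periods->[-1]{attempts}++;
        }
    }
    return \%stats;
}

# The four failed records from the sample log, with seconds since midnight.
my $stats = summarize_failures(
    [
        [ '000000001', 2,    '2008-Feb-01 00:00:02' ],
        [ '000000003', 1202, '2008-Feb-01 00:20:02' ],
        [ '000000001', 3602, '2008-Feb-01 01:00:02' ],
        [ '000000003', 4802, '2008-Feb-01 01:20:02' ],
    ],
    60 * 60,
);

for my $id ( sort keys %$stats ) {
    for my $period ( @{ $stats->{$id} } ) {
        printf "%s  %s  %d\n", $id, $period->{last}, $period->{attempts};
    }
}
```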
Re^2: Logfile analysis : How to differentiate records with timestamp by tuakilan (Acolyte) on Mar 11, 2008 at 13:27 UTC
When I execute this script, I get the following error:

$ perl -wc a300.pl
Sequence (?<t...) not recognized in regex; marked by <-- HERE in m/ \A (?<t <-- HERE ime_stamp> \d{4}-\p{IsAlpha}{3}-\d\d [ ] \d\d:\d\d:\d\d ) .* [ ] - [ ] id [ ] = [ ] (?<id> \d+ ), [ ] Status [ ] = [ ] (?<status> \w+ ) / at a300.pl line 14.
Re^3: Logfile analysis : How to differentiate records with timestamp by hipowls (Curate) on Mar 11, 2008 at 20:17 UTC
I'm surprised it didn't exit with an error message similar to

Perl v5.10.0 required--this is only v5.8.8, stopped at Perl-1.pl line 6.
BEGIN failed--compilation aborted at Perl-1.pl line 6.

Ah, your error occurred on line 14, whereas the regex in my script is on line 16; I guess you removed the `use 5.010_000;`. If you are running an older version of perl, you'll need to remove the named captures and use regular captures instead: just remove the `?<name>` and use numbered captures to initialize the variables. You'll also need to replace the `say`s with `print ... "\n"`.
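As a sketch of the change described above, the same pattern with the `?<name>` parts removed and the variables initialized from the numbered captures might look like this. `parse_fields` is a hypothetical wrapper added for illustration; everything here uses only features that should be available on perl 5.8.

```perl
use strict;
use warnings;

# Same pattern as in the 5.10 script, minus the named captures.
# Note: no dollar signs in the regex comments, since variables are
# interpolated even inside /x comments.
my $log_parser = qr{
    \A
    ( \d{4}-\p{IsAlpha}{3}-\d\d [ ] \d\d:\d\d:\d\d )   # capture 1: time stamp
    .*
    [ ] - [ ] id [ ] = [ ] ( \d+ ),                    # capture 2: id
    [ ] Status [ ] = [ ] ( \w+ )                       # capture 3: status
}msx;

# Hypothetical wrapper: returns (time stamp, id, status) or an empty list.
sub parse_fields {
    my ($line) = @_;
    my ( $time_stamp, $id, $status ) = $line =~ $log_parser
        or return;
    return ( $time_stamp, $id, $status );
}

my $line
    = '2008-Feb-01 01:20:02 UTC (GMT +0000) - id = 000000003, Status = failed';

if ( my ( $time_stamp, $id, $status ) = parse_fields($line) ) {
    # print with an explicit newline instead of say
    print "$id $time_stamp $status\n";
}
```

Matching in list context returns the captures in order, so the assignment replaces the `@+{qw(time_stamp id status)}` hash slice from the 5.10 version.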
Re^4: Logfile analysis : How to differentiate records with timestamp by tuakilan (Acolyte) on Mar 12, 2008 at 00:12 UTC