
Hi Monks,

This is a follow-up to http://www.perlmonks.org/?node_id=673673, where I posted questions on how to deal with dates and reoccurrences.

I am a newbie trying to finish an assignment, and I am pulling my hair out :(

The task

Read a raw ASCII log file that was collected by a toll-collecting machine.

From the log file, using "tids" and "channel" as the key, locate records with the same key that are more than 3600 seconds apart.

Count how many times this happens for each key and report the count as 'occurrences'.

Output the result in the format shown in 'report-2007-01-01.txt'.

In SQL, it would look similar to this:

select * from
where channel = seven
and time > 3600 seconds
commit;

Exact raw ASCII log file from the toll-collecting machine, tollog-2007-jan-01.txt:

2008-Jan-01 00:00:00 UTC (GMT +0000) - Toll: channel = seven, ref = xxx.xxxxxx.xxx.xxxxx.xxxxxxx.xxxxxxxxxxxxxxxxxxxxx, tids = 123456789
2008-Jan-01 00:10:00 UTC (GMT +0000) - Toll: channel = six, ref = xxx.xxxxxx.xxx.xxxxx.xxxxxxx.xxxxxxxxxxxxxxxxxxxxx, tids = 987654321
2008-Jan-01 00:20:00 UTC (GMT +0000) - Toll: channel = three, ref = xxx.xxxxxx.xxx.xxxxx.xxxxxxx.xxxxxxxxxxxxxxxxxxxxx, tids = 223344221
2008-Jan-01 00:30:00 UTC (GMT +0000) - Toll: channel = four, ref = xxx.xxxxxx.xxx.xxxxx.xxxxxxx.xxxxxxxxxxxxxxxxxxxxx, tids = 998829992
2008-Jan-01 00:40:00 UTC (GMT +0000) - Toll: channel = three, ref = xxx.xxxxxx.xxx.xxxxx.xxxxxxx.xxxxxxxxxxxxxxxxxxxxx, tids = 938874724
2008-Jan-01 00:50:00 UTC (GMT +0000) - Toll: channel = two, ref = xxx.xxxxxx.xxx.xxxxx.xxxxxxx.xxxxxxxxxxxxxxxxxxxxx, tids = 229928828
2008-Jan-01 01:00:00 UTC (GMT +0000) - Toll: channel = five, ref = xxx.xxxxxx.xxx.xxxxx.xxxxxxx.xxxxxxxxxxxxxxxxxxxxx, tids = 998822992
2008-Jan-01 01:10:00 UTC (GMT +0000) - Toll: channel = seven, ref = xxx.xxxxxx.xxx.xxxxx.xxxxxxx.xxxxxxxxxxxxxxxxxxxxx, tids = 123456789

As you can see from the above, records 1 and 8 are the desired output, as these two records have the same channel name and tids number and are 4200 seconds apart.
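To make the expected match concrete, here is a minimal sketch that parses records 1 and 8 and computes the gap between them. It assumes each record occupies one physical line in the real file and that the timestamps really are UTC (hence `timegm` rather than `timelocal`). The names `parse_record` and `%MONTH` are hypothetical, made up for this illustration only.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Month abbreviation -> 0-based month number, as timegm() expects.
my %MONTH = (jan => 0, feb => 1, mar => 2, apr => 3, may => 4,  jun => 5,
             jul => 6, aug => 7, sep => 8, oct => 9, nov => 10, dec => 11);

# parse_record() is a hypothetical helper: it returns (epoch, channel, tids)
# for one log line, or an empty list if the line does not match.
sub parse_record {
    my ($line) = @_;
    my ($y, $mon, $d, $h, $min, $s, $channel, $tids) = $line =~
        /^(\d{4})-(\w{3})-(\d\d) (\d\d):(\d\d):(\d\d) UTC\b.*?\bchannel = (\w+),.*\btids = (\d+)/
        or return;
    my $m = $MONTH{ lc $mon };
    return unless defined $m;
    return (timegm($s, $min, $h, $d, $m, $y), $channel, $tids);
}

# Records 1 and 8 from the sample log, written as single lines (assumption).
my $ref  = 'xxx.xxxxxx.xxx.xxxxx.xxxxxxx.xxxxxxxxxxxxxxxxxxxxx';
my $rec1 = "2008-Jan-01 00:00:00 UTC (GMT +0000) - Toll: channel = seven, ref = $ref, tids = 123456789";
my $rec8 = "2008-Jan-01 01:10:00 UTC (GMT +0000) - Toll: channel = seven, ref = $ref, tids = 123456789";

my ($t1, $ch1, $id1) = parse_record($rec1);
my ($t8, $ch8, $id8) = parse_record($rec8);

my $gap = $t8 - $t1;
print "same key\n" if $ch1 eq $ch8 and $id1 == $id8;
print "gap = $gap seconds\n";    # prints "gap = 4200 seconds"
```

Since 4200 is greater than 3600 and both the channel and the tids agree, this pair is the one the report should pick up.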

Desired report file: report-2007-01-01.txt

TIDS               time                    Occurrences
======================================================
123456789          2008-Jan-01 01:10:00     2
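A report in this shape could be produced with `sprintf`/`printf` and fixed-width columns. The following is only a sketch: the column widths, the header text, and the `@rows` data structure are assumptions chosen to resemble the sample above, not a known specification of the report.

```perl
use strict;
use warnings;

# Hypothetical result rows: [tids, timestamp of the late record, count].
my @rows = (
    [ '123456789', '2008-Jan-01 01:10:00', 2 ],
);

# Left-aligned fixed-width columns; the widths are an assumption.
my $fmt = "%-18s %-24s %s\n";

my $header = sprintf $fmt, 'TIDS', 'time', 'Occurrences';
my $rule   = '=' x (length($header) - 1);    # same width as the header, minus newline

my $report = $header . "$rule\n";
$report .= sprintf $fmt, @$_ for @rows;

print $report;
```

Building the whole report in a string first makes it easy to send it either to the screen or to a file with `print {$fh_out} $report`.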

So far I have written the following, but it produces a zero-byte output file :(

#!/usr/local/bin/perl -w
use strict;
use warnings;
use Time::Local;
my $infile = 'input.2008-01-01.log';
my $outfile = 'output.2008-01-01.log';
my($fh_out, $fh);
open($fh_out, '>', $outfile) or die "Could not open outfile: $!";
open($fh, '<', $infile) or die "Could not open logfile: $!";
my %track;
while (<$fh>){
  my ($date,$ignoreIDLiteral,$id) = split / - | = /;
  chomp $id;
  my   $time = dateconv($date);
  my $prevtime = $track{$id}{TIME};
  $track{$id}{TIME}=$time;
  $track{$id}{DATE}=$date;
  $track{$id}{COUNT}++;
  print "$id\t$date\t$track{$id}{COUNT}\n"
      if $prevtime and $time - $prevtime > 3600;

}
sub dateconv{
  my $d = shift;
  my %month = qw[jan 1 feb 2 mar 3 apr 4 may 5 jun 6 jul 7
                 aug 8 sep 9 oct 10 nov 11 dec 12];
  my @p = $d=~/(\d+)-(\w+)-(\d+)\s(\d+):(\d+):(\d+)/;
  $p[1]=$month{ lc $p[1]  } - 1;
  return  timelocal(@p[5,4,3,2,1,0]);
#timelocal($sec,$min,$hour,$mday,$mon,$year);
}
close $fh_out;
close $fh;

I think I messed up the regex for the incoming log file. Can anyone point out where I went wrong?
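Two things look like likely culprits, and the sketch below tries to show both. First, `split / - | = /` on a line of this format yields `seven, ref` as its third field rather than the tids. Second, a `print` with no filehandle goes to STDOUT, which would leave the output file empty. This is a hedged sketch, not a definitive fix: it assumes one record per physical line, reads "longer than 3600 seconds" as the gap between consecutive records with the same tids and channel, and uses in-memory filehandles in place of the real input and output files so that it runs on its own.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

my %MONTH = (jan => 0, feb => 1, mar => 2, apr => 3, may => 4,  jun => 5,
             jul => 6, aug => 7, sep => 8, oct => 9, nov => 10, dec => 11);

# Rebuild the eight sample records as single lines (assumption).
my $ref    = 'xxx.xxxxxx.xxx.xxxxx.xxxxxxx.xxxxxxxxxxxxxxxxxxxxx';
my @sample = (
    [ '00:00:00', 'seven', 123456789 ], [ '00:10:00', 'six',   987654321 ],
    [ '00:20:00', 'three', 223344221 ], [ '00:30:00', 'four',  998829992 ],
    [ '00:40:00', 'three', 938874724 ], [ '00:50:00', 'two',   229928828 ],
    [ '01:00:00', 'five',  998822992 ], [ '01:10:00', 'seven', 123456789 ],
);
my $input = join '', map {
    "2008-Jan-01 $_->[0] UTC (GMT +0000) - Toll: channel = $_->[1], ref = $ref, tids = $_->[2]\n"
} @sample;

# Step 1: what the original split returns for the first record.
my ($first) = split /\n/, $input;
my @fields  = split / - | = /, $first;
print "third field from split: '$fields[2]'\n";    # the channel plus ", ref"

# Step 2: capture the fields with a regex and print to the output handle.
# In-memory filehandles stand in for the real log and report files.
my $output = '';
open my $fh,     '<', \$input  or die "Could not read sample data: $!";
open my $fh_out, '>', \$output or die "Could not open output: $!";

my %track;
while (my $line = <$fh>) {
    my ($date, $y, $mon, $d, $h, $min, $s, $channel, $tids) = $line =~
        /^((\d{4})-(\w{3})-(\d\d) (\d\d):(\d\d):(\d\d)) UTC\b.*?\bchannel = (\w+),.*\btids = (\d+)/
        or next;                                   # skip lines that do not match
    my $m = $MONTH{ lc $mon };
    next unless defined $m;

    my $time = timegm($s, $min, $h, $d, $m, $y);   # timestamps are UTC
    my $key  = "$tids|$channel";                   # tids and channel together
    my $prev = $track{$key}{TIME};
    $track{$key}{TIME} = $time;
    $track{$key}{COUNT}++;

    # Note the explicit filehandle: without it the line goes to STDOUT.
    print {$fh_out} "$tids\t$date\t$track{$key}{COUNT}\n"
        if defined $prev and $time - $prev > 3600;
}
close $fh;
close $fh_out or die "Could not close output: $!";

print $output;
```

With the sample data, only the second `seven`/`123456789` record is written, with a count of 2. For the real files, the two in-memory `open` calls would be replaced by opens on `$infile` and `$outfile`.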

In reply to Comparing Dates and Reoccurance - Part II by tuakilan
