Comparing Dates and Reoccurance

tuakilan has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Comparing Dates and Reoccurance by NetWallah (Canon) on Mar 12, 2008 at 05:12 UTC
This should do the trick. Formatting, printing headers, and adapting the I/O to read from a file are left as an exercise.

    use strict;
    use Time::Local;

    my %track;
    while (<DATA>) {
        # Each line looks like: 2007-Nov-07 00:00:00 - id = 000000001
        my ($date, $ignoreIDLiteral, $id) = split / - | = /;
        chomp $id;
        my $time     = dateconv($date);
        my $prevtime = $track{$id}{TIME};
        $track{$id}{TIME} = $time;
        $track{$id}{DATE} = $date;
        $track{$id}{COUNT}++;
        # Report an ID when it shows up again more than an hour after its previous entry
        print "$id\t$date\t$track{$id}{COUNT}\n"
            if $prevtime and $time - $prevtime > 3600;
    }

    sub dateconv {
        my $d = shift;
        my %month = qw[jan 1 feb 2 mar 3 apr 4 may 5 jun 6
                       jul 7 aug 8 sep 9 oct 10 nov 11 dec 12];
        my @p = $d =~ /(\d+)-(\w+)-(\d+)\s(\d+):(\d+):(\d+)/;
        $p[1] = $month{ lc $p[1] } - 1;      # timelocal expects a 0-based month
        return timelocal(@p[5,4,3,2,1,0]);   # timelocal($sec,$min,$hour,$mday,$mon,$year);
    }

    __DATA__
    2007-Nov-07 00:00:00 - id = 000000001
    2007-Nov-07 00:30:01 - id = 000000002
    2007-Nov-07 00:40:00 - id = 000000003
    2007-Nov-07 01:20:01 - id = 000000001

prints:

    000000001	2007-Nov-07 01:20:01	2

"As you get older three things happen. The first is your memory goes, and I can't remember the other two... " - Sir Norman Wisdom
Re^2: Comparing Dates and Reoccurance by tuakilan (Acolyte) on Mar 12, 2008 at 07:31 UTC
Hi NetWallah, I extended your code as shown below, but when I run it, it does not produce the desired output.

    #!/usr/local/bin/perl -w
    use strict;
    use warnings;
    use Time::Local;

    my $infile  = 'input.2008-01-01.log';
    my $outfile = 'output.2008-01-01.log';
    my ($fh_out, $fh);
    open($fh_out, '>', $outfile) or die "Could not open outfile: $!";
    open($fh, '<', $infile) or die "Could not open logfile: $!";

    my %track;
    while (<$fh>) {
        my ($date, $ignoreIDLiteral, $id) = split / - | = /;
        chomp $id;
        my $time     = dateconv($date);
        my $prevtime = $track{$id}{TIME};
        $track{$id}{TIME} = $time;
        $track{$id}{DATE} = $date;
        $track{$id}{COUNT}++;
        print "$id\t$date\t$track{$id}{COUNT}\n"
            if $prevtime and $time - $prevtime > 3600;
    }

    sub dateconv {
        my $d = shift;
        my %month = qw[jan 1 feb 2 mar 3 apr 4 may 5 jun 6
                       jul 7 aug 8 sep 9 oct 10 nov 11 dec 12];
        my @p = $d =~ /(\d+)-(\w+)-(\d+)\s(\d+):(\d+):(\d+)/;
        $p[1] = $month{ lc $p[1] } - 1;
        return timelocal(@p[5,4,3,2,1,0]);   # timelocal($sec,$min,$hour,$mday,$mon,$year);
    }

    close $fh_out;
    close $fh;
Re^3: Comparing Dates and Reoccurance by NetWallah (Canon) on Mar 12, 2008 at 15:46 UTC
You are opening and closing $fh_out, but you are not WRITING to it. Do you see anything on STDOUT? If the file format is as you described in your initial post, this code has been tested and it works. However, it is fragile: even the slightest difference in format will throw it off. I would suggest learning how to debug the program by stepping through each statement and checking the values.
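To illustrate the point about writing to the handle, here is a minimal sketch, not the poster's actual fix. The helper name write_record is hypothetical, and an in-memory filehandle stands in for the real output file so the example runs on its own. The key detail is passing the handle to print.

```perl
use strict;
use warnings;

# Hypothetical helper: write one tab-separated report line to the given handle.
sub write_record {
    my ($fh, $id, $date, $count) = @_;
    print {$fh} "$id\t$date\t$count\n";   # the handle goes first, in braces
}

# An in-memory filehandle stands in for the real output file (an assumption
# made only to keep this example self-contained).
my $buffer = '';
open(my $fh_out, '>', \$buffer) or die "Could not open in-memory handle: $!";
write_record($fh_out, '000000001', '2007-Nov-07 01:20:01', 2);
close $fh_out;

print $buffer;   # show what would have landed in the file
```

In the script above, the equivalent change would be to put {$fh_out} after print in the while loop.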
Re: Comparing Dates and Reoccurance by Narveson (Chaplain) on Mar 12, 2008 at 05:35 UTC
The algorithm won't be hard if you can say what you're aiming for. It doesn't have to be a formal statement of requirements; an example is fine as long as it covers the obvious questions. Do you want IDs 000000002 and 000000003 in the output? What is the Occurance and why is it 1? I'll go ahead and assume the input log is sorted by timestamp and in fixed-width format.

    # see manpage for unpack
    # A20    = the 20-character timestamp
    # @28 A9 = skip to offset 28, then take the 9-character ID
    my $TEMPLATE = 'A20 @28A9';

    # read the input log into a hash
    my %last_time;
    while (<DATA>) {
        my ($timestamp, $id) = unpack $TEMPLATE, $_;
        $last_time{$id} = $timestamp;
    }

    # print the output log
    # I have omitted the header
    for my $id (sort keys %last_time) {
        print "$id\t$last_time{$id}\n";
    }
Re: Comparing Dates and Reoccurance by apl (Monsignor) on Mar 12, 2008 at 09:51 UTC
Time::Simple lets you subtract two times, returning the difference in seconds. You can also search CPAN for other modules with Time in the name.
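As a rough illustration of getting a difference in seconds, the sketch below uses the core module Time::Piece rather than Time::Simple, so it is a swapped-in technique and not the module suggested above. The helper name seconds_between is hypothetical, and the format string assumes timestamps like the ones in NetWallah's sample data.

```perl
use strict;
use warnings;
use Time::Piece;

# Assumed timestamp layout, e.g. "2007-Nov-07 01:20:01"
my $FORMAT = '%Y-%b-%d %H:%M:%S';

# Hypothetical helper: number of seconds from $earlier to $later.
sub seconds_between {
    my ($earlier, $later) = @_;
    my $t1 = Time::Piece->strptime($earlier, $FORMAT);
    my $t2 = Time::Piece->strptime($later,   $FORMAT);
    # Subtracting two Time::Piece objects yields a Time::Seconds object
    return ($t2 - $t1)->seconds;
}

my $gap = seconds_between('2007-Nov-07 00:00:00', '2007-Nov-07 01:20:01');
print "$gap\n";                                    # prints 4801
print "more than an hour apart\n" if $gap > 3600;
```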
Re: Comparing Dates and Reoccurance by wade (Pilgrim) on Mar 12, 2008 at 23:20 UTC
If I understand the problem correctly, I'd make an array of IDs where each element is a hash containing a DateTime (a module you can get from CPAN) and an occurrence count. Then, when you iterate through the input file, you can check the date from the input against the date in the array (at the appropriate ID). The code might look something like this (I haven't tried any of this -- it's off the top of my pointy little head):

    use strict;
    use warnings;
    use DateTime;

    my @earliestEvent;
    # $filename and the OUTFILE handle are assumed to be set up elsewhere
    open LOGFILE, "<", $filename or die "...";
    while (my $input = <LOGFILE>) {
        chomp $input;
        my ($dateString, $id) = split / - id = /, $input;
        my $date = DateTime->new(
            # use one of the constructors to
            # fill the date
        );
        if (!exists($earliestEvent[$id])) {
            $earliestEvent[$id] = {};
            $earliestEvent[$id]->{"count"} = 1;
            $earliestEvent[$id]->{"date"} = $date;
        }
        else {
            # clone first: add() modifies the DateTime object in place
            my $hourBoundary = $earliestEvent[$id]->{"date"}->clone;
            $hourBoundary->add(hours => 1);
            if ($date > $hourBoundary) {
                print OUTFILE "$id\t" .
                    $earliestEvent[$id]->{"date"}->datetime . "\t" .
                    $earliestEvent[$id]->{"count"} . "\n";
                $earliestEvent[$id]->{"count"} = 1;
                $earliestEvent[$id]->{"date"} = $date;
            }
            else {
                ++$earliestEvent[$id]->{"count"};
            }
        }
    }
    # then, of course, you'll need to print the remaining ones

Note: DateTime has some idiosyncrasies. You'll probably, for example, want to use the Floating time zone. Does that work for you?
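One way the constructor placeholder and the floating time zone might be filled in is sketched below. It assumes timestamps in the form shown in NetWallah's sample data; the helper name parse_log_date and the month table are hypothetical and not part of the code above.

```perl
use strict;
use warnings;
use DateTime;

# Map three-letter month abbreviations to the 1-based months DateTime expects.
my %MONTH;
@MONTH{qw(jan feb mar apr may jun jul aug sep oct nov dec)} = (1 .. 12);

# Hypothetical helper: turn "2007-Nov-07 01:20:01" into a DateTime object.
sub parse_log_date {
    my ($stamp) = @_;
    my ($year, $mon, $day, $hour, $min, $sec) =
        $stamp =~ /^(\d{4})-(\w{3})-(\d{2})\s+(\d{2}):(\d{2}):(\d{2})/
        or die "Unrecognized timestamp: $stamp";
    my $month = $MONTH{ lc $mon } or die "Unknown month: $mon";
    return DateTime->new(
        year => $year, month  => $month, day    => $day,
        hour => $hour, minute => $min,   second => $sec,
        time_zone => 'floating',   # no time zone or DST adjustments
    );
}

my $first = parse_log_date('2007-Nov-07 00:00:00');
my $later = parse_log_date('2007-Nov-07 01:20:01');

# clone before add(): add() changes the object it is called on
my $boundary = $first->clone->add(hours => 1);

my $message = $later > $boundary ? "more than an hour apart" : "within the hour";
print "$message\n";   # prints "more than an hour apart"
```

Cloning before calling add() leaves the stored date untouched, which matters if the same object is later printed or compared again.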