Ovid has asked for the wisdom of the Perl Monks concerning the following question:
I've searched through CPAN for an appropriate module, but to no avail. Assuming that I have a hash whose keys are in YYYY-MM format, I need to find the gaps in the date ranges and return those gaps. I've written some code that appears to work, but it seems like overkill. Boundary conditions are a nightmare and the logic gets a bit funky in places.
I'm posting the code, in case an alternative doesn't exist, but I'm hoping to find a simpler solution. I think my solution is as simple as I can reasonably make it, but a module that's been tested more robustly would be much appreciated.
Side note: &find_gaps should return a reference to an array containing all dates that should be there, but aren't.
use strict;
use warnings;
use Data::Dumper;
my @temp = ( [ qw( testthis 2000-4 2000-2 2000-1 2000-12 2001-1 2002-1
+2 1999-12 ) ], # bad
[ qw( testthis 1999-1 1999-2 1999-3 1999-4 1999-5 1999-6
+1999-7 1999-8 1999-9 1999-10 1999-11 1999-12 2001-3 ) ], # bad
[ qw( 1999-1 1999-2 1999-3 1999-4 1999-5 1999-6 1999-7 19
+99-8 1999-9 1999-10 1999-11 1999-12 2000-1 ) ], # good
[ qw( 1998-12 1999-1 ) ], # good
[ qw( 2004-1 ) ], # good
[ qw( 2004-12 ) ], # good
[ qw( 2004-12 2004-11 ) ], # good
[ qw( testthis 2004-12 2004-9 2004-11 ) ] ); # bad
foreach ( @temp ) {
my %dates = map { $_, '' } @$_;
if ( exists $dates{ 'testthis' } ) {
delete $dates{ 'testthis' };
print Dumper( \%dates );
my ( $missing ) = find_gaps( \%dates );
print Dumper( $missing ), '-' x 20 ,"\n";
}
}
#
# Above is my test code. Below is the code in question
#
sub find_gaps {
my $dates = shift;
my %year;
# Create hash with years as keys and values being a list with
# first month and last month in that year range
foreach my $date ( keys %$dates ) {
my $this_year = substr $date, 0, 4;
next if exists $year{ $this_year };
$year{ $this_year } = get_range( $this_year, \%$dates );
}
my @years = sort keys %year;
my $gap = 0;
my ( @missing, @present );
# Year span is greater than number of years
if ( scalar @years > 1 and ( ( $years[ -1 ] - $years[ 0 ] ) ) != (
+ scalar @years ) - 1 ) {
for my $year ( $years[ 0 ] .. $years[ -1 ] ) {
if ( ! grep { /$year/ } @years ) {
for ( 1 .. 12 ) {
push @missing, "$year-$_";
} # next month
}
} # next year
}
# Gap in months?
for my $year_index ( 0 .. $#years ) {
my $curr_year = $years[ $year_index ];
my $first_month = $year{ $curr_year }->[ 0 ];
my $last_month = $year{ $curr_year }->[ 1 ];
next if $year_index == 0 and $first_month == 12;
next if $year_index == $#years and $first_month == 1 and ! $la
+st_month;
my ( $first, $last );
# first year, but more than one year
if ( $year_index == 0 and $#years != 0 ) {
( $first, $last ) = ( $first_month, 12 ); # first year
# first and only year
} elsif ( $year_index == 0 ) {
( $first, $last ) = ( $first_month, $last_month ? $last_mo
+nth : $first_month );
#last year
} elsif ( $year_index == $#years ) {
( $first, $last ) = ( 1, $last_month ? $last_month : $firs
+t_month ); # last year
# intervening years
} else {
( $first, $last ) = ( 1, 12 );
}
for my $month ( $first .. $last ) {
if ( ! exists $dates->{ $curr_year."-".$month } ) {
push @missing, $curr_year."-".$month;
}
} # next $month
} # next $year_index
return \@missing;
}
sub get_range {
my ( $year, $dates ) = @_;
my @months;
# If the %date has the same year, grab the month
@months = sort { $a <=> $b }
map { substr( $_, 5 ) }
grep { substr( $_, 0, 4 ) == $year } keys %$dates;
# Return range if more than one month, else return the month
$#months > 0 ? [ $months[ 0 ], $months[ -1 ] ] : [ $months[ 0 ], 0
+ ];
}
Cheers,
Ovid
Join the Perlmonks Setiathome Group or just click on the the link and check out our stats.
Re: Finding gaps in date ranges
by MeowChow (Vicar) on May 11, 2001 at 00:51 UTC
|
I would approach this problem with a conversion, from date format, to a more easily dealt-with numeric format, and vice-versa:
sub find_gaps {
my ($pdate, @dates) = sort {$a <=> $b}
map { my ($y,$m) = split /-/; $y*12 + $m - 1
+}
keys %{+pop};
my @skips;
foreach (@dates) {
push @skips, $pdate+1 .. $_-1;
$pdate = $_;
}
[ map { int ($_/12) . "-" . (1 + $_%12) } @skips ];
}
UPDATE: couldn't help golfing things a bit...
MeowChow
s aamecha.s a..a\u$&owag.print | [reply] [Watch: Dir/Any] [d/l] |
Re (tilly) 1: Finding gaps in date ranges
by tilly (Archbishop) on May 11, 2001 at 00:43 UTC
|
An old tactic when faced with a representation that has a lot of boundary cases. Convert to and from a representation that is easier to work with, like so:
use strict;
use Carp;
# Takes a list of months in yyyymm format, returns the gaps in sorted
+order.
sub find_gaps {
my @dates = sort {$a <=> $b} map {yyyy_mm2mon_num($_)} @_;
my $last_mon_count = shift @dates;
my @gaps;
foreach my $mon_count (@dates) {
foreach my $missed (($last_mon_count + 1)..($mon_count-1)) {
push @gaps, mon_num2yyyy_mm($missed);
}
$last_mon_count = $mon_count;
}
return @gaps;
}
# Converts a count of months back into yyyymm format
sub mon_num2yyyy_mm {
my $mon_num = shift;
use integer;
my $yyyy = $mon_num / 12;
my $mm = 1 + $mon_num % 12;
# Sanity check
unless (1900 < $yyyy and $yyyy < 2200) {
carp("Month number '$mon_num' gives a year of $yyyy which seems st
+range");
}
return "$yyyy-$mm";
}
# Converts yyyymm format into a count of months from year 0. (Which d
+idn't
# exist, shoot me. It is a convenient base-point for calculations.)
sub yyyy_mm2mon_num {
my $date = shift;
if ($date =~ /(\d{4}).*?(\d\d?)/) {
return 12*$1 + $2 - 1;
}
else {
confess("Date '$date' is not in yyyymm format?");
}
}
| [reply] [Watch: Dir/Any] [d/l] |
|
The reference to @_ in find_gaps should actually be keys %{ $_[0] }. From there, I ran it with the following dates (first line of dates from my test suite):
2000-4 2000-2 2000-1 2000-12 2001-1 2002-12 1999-12
The output:
$VAR1 = '2000-3';
It found the first gap but not the rest.
Cheers,
Ovid
Join the Perlmonks Setiathome Group or just click on the the link and check out our stats. | [reply] [Watch: Dir/Any] [d/l] [select] |
|
# The test code:
print map "$_\n", find_gaps(
qw(2000-4 2000-2 2000-1 2000-12 2001-1 2002-12 1999-12)
);
# The output
2000-3
2000-5
2000-6
2000-7
2000-8
2000-9
2000-10
2000-11
2001-2
2001-3
2001-4
2001-5
2001-6
2001-7
2001-8
2001-9
2001-10
2001-11
2001-12
2002-1
2002-2
2002-3
2002-4
2002-5
2002-6
2002-7
2002-8
2002-9
2002-10
2002-11
| [reply] [Watch: Dir/Any] [d/l] |
Re: Finding gaps in date ranges
by Anonymous Monk on May 11, 2001 at 01:11 UTC
|
I'm a big fan of peanos axioms. So
I started
with the idea: if it weren't dates we were dealing with,
but natural numbers, then I would just sort the numbers,
and compute the gaps with one simple loop.
I decided to convert you date-format to something
that can be sorted numerically (YYYYMM, with month in two
digits, like "02"), and I made up a function
"nextmonth" for incrementing the date.
sub nextmonth {
my($year, $month);
$year = substr($_[0], 0, 4);
$month = substr($_[0], 4, 2);
$month++;
while ($month > 12) {
$year++;
$month-=12;
}
return $year . sprintf("%02d", $month);
}
sub find_gaps {
my $dates = shift;
my(@missing, @keys, $last);
# convert your date-format to mine
@keys = map { s/-(.)$/0$1/; s/-//; $_ } keys %$dates;
# my format can be sorted numerically
@keys = sort @keys;
# in the loop we need two variables:
# $last for the date before the current one,
# and $date for the current date
$last = shift(@keys);
foreach my $date ( @keys ) {
while (1) {
$last = nextmonth($last);
last if $last eq $date;
push(@missing, $last);
}
$last = $date;
}
# convert my date-format back to yours
@missing =
map { s/0(.)$/$1/; s/^(....)/$1-/; $_ } @missing;
return \@missing;
}
share and enjoy
| [reply] [Watch: Dir/Any] [d/l] |
|
boohoo, the above article is mine, but I posted it anonymously by mistake. oh higher powers, if it please you,
transfer the XPs of the article to this humble scribe.
I'm a big fan of peanos
axioms. So I started with the idea: if it weren't dates we were dealing with, but
natural numbers, then I would just sort the numbers, and compute the gaps with one simple loop.
I decided to convert you date-format to something that can be sorted numerically (YYYYMM, with
month in two digits, like "02"), and I made up a function "nextmonth" for incrementing the date.
sub nextmonth {
my($year, $month);
$year = substr($_[0], 0, 4);
$month = substr($_[0], 4, 2);
$month++;
while ($month > 12) {
$year++;
$month-=12;
}
return $year . sprintf("%02d", $month);
}
sub find_gaps {
my $dates = shift;
my(@missing, @keys, $last);
# convert your date-format to mine
@keys = map { s/-(.)$/0$1/; s/-//; $_ } keys %$dates;
# my format can be sorted numerically
@keys = sort @keys;
# in the loop we need two variables:
# $last for the date before the current one,
# and $date for the current date
$last = shift(@keys);
foreach my $date ( @keys ) {
while (1) {
$last = nextmonth($last);
last if $last eq $date;
push(@missing, $last);
}
$last = $date;
}
# convert my date-format back to yours
@missing =
map { s/0(.)$/$1/; s/^(....)/$1-/; $_ } @missing;
return \@missing;
}
share and enjoy
--
Brigitte 'I never met a chocolate I didnt like' Jellinek
http://www.horus.com/~bjelli/ http://perlwelt.horus.at | [reply] [Watch: Dir/Any] [d/l] |
Re: Finding gaps in date ranges
by cLive ;-) (Prior) on May 11, 2001 at 00:52 UTC
|
#!/usr/bin/perl
use strict;
my @temp = qw(2000-4 2000-2 2000-1 2000-12 2001-1 2002-12 1999-12);
my @dates = (sort {
my @a_date = split '-', $a;
my @b_date = split '-', $b;
$a_date[0] <=> $b_date[0]
||
$a_date[1] <=> $b_date[1];
} @temp);
my @start_date = split '-', shift @dates;
my @end_date = split '-', pop @dates;
# use mod 12 to make job easy
my $start_count = $start_date[0]*12 + $start_date[1]-1;
my $end_count = $end_date[0]*12 + $end_date[1]-1;
# hash of dates in this time
my %date_hash;
# build hash
for ($start_count..$end_count) {
# here's the sneaky trick :)
my $year = int($_/12);
my $month = $_%12 + 1;
$date_hash{$year}{$month}++;
}
# now delete elements that exist
for (@temp) {
/(\d+)-(\d+)/;
delete $date_hash{$1}{$2};
}
# print missing
print "Missing dates are\n\n";
my $count=1;
foreach my $year (sort {$a <=> $b} keys %date_hash) {
for (sort {$a <=> $b} keys %{$date_hash{$year}}) {
printf("%4d-%2d ",$year,$_);
print "\n" unless $count%4; $count++;
}
}
cLive ;-) | [reply] [Watch: Dir/Any] [d/l] |
Re: Finding gaps in date ranges
by suaveant (Parson) on May 11, 2001 at 00:30 UTC
|
What about something like...
my $error = 0;
my @tmp = sort datesort qw( 2000-4 2000-2 2000-1 2000-12 2001-1 2002-1
+2 1999-12 );
my($year,$month) = split '-', $tmp[0];
for(my $i = 1; $i < @tmp; $i++) {
if(++$month == 13) {
$month = 1;
$year++;
}
if($tmp[$i] ne "$year-$month") {
print "ERROR! gap at '$year-$month', '$tmp[$i]' found!\n";
$error = 1;
last;
}
}
print "Success\n" unless $error;
sub datesort {
my($a1,$a2) = split '-', $a;
my($b1,$b2) = split '-', $b;
$a1 <=> $b1 or $a2 <=> $b2;
}
Tested on the bad and good arrays, you should be able to modify it
for your needs easily enough...
it would be shorter if your dates were zero padded.
Update on afterthought I should say what it does...
it sorts the date array (could be an array of hash keys), it
then starts at the first, or earliest one, it goes through
the array adding one to the month and wrapping the year and setting month to 1 if
the month hits 13. it goes till it finds a date it didnt expect, or
gets to the end of the array sucessfully. Pretty simple,
needed the special sort since months were not zero padded.
- Ant | [reply] [Watch: Dir/Any] [d/l] |
|
ERROR! gap at '2000-3', '2000-4' found!
However, gaps exist at 2000-5 through 2000-11 and 2001-2 through 2002-11, yet were not reported. Did I misunderstand something about your code or how to use it?
Cheers,
Ovid
Join the Perlmonks Setiathome Group or just click on the the link and check out our stats. | [reply] [Watch: Dir/Any] [d/l] |
|
my $error = 0;
my @gaps;
my @tmp = sort datesort qw( 2000-4 2000-12 2001-1 2002-12 1999-12 );
my($year,$month) = split '-', $tmp[0];
for(my $i = 1; $i < @tmp; $i++) {
my $gap = 0;
my @range;
do {
$range[1] = "$year-$month";
if(++$month == 13) {
$month = 1;
$year++;
}
if($tmp[$i] ne "$year-$month") {
$range[0] = "$year-$month" unless $gap;
$gap++;
$error = 1;
}
if($gap > 100) {
print "ERROR! infinite loop\n";
last;
}
} until($tmp[$i] eq "$year-$month");
if($gap) {
push @range, "$year-$month";
push @gaps, \@range;
}
}
if($error) {
for(@gaps) {
print "Gap from $_->[0] to $_->[1]\n";
}
}
print "Success\n" unless $error;
sub datesort {
my($a1,$a2) = split '-', $a;
my($b1,$b2) = split '-', $b;
$a1 <=> $b1 or $a2 <=> $b2;
}
This stores the ranges...
- Ant | [reply] [Watch: Dir/Any] [d/l] |
|
Yeah, just realized that... easy fix... one second...
- Ant
| [reply] [Watch: Dir/Any] |
Re: Finding gaps in date ranges
by Sifmole (Chaplain) on May 11, 2001 at 01:05 UTC
|
I guess I will throw mine in as well...
Caveat, your response list will have 0 padded months, but that would be easy enough to correct.... Matter of fact I will, just give me a minute... :)
UpdateRemoved 0 padding in response list.
use strict;
use Data::Dumper;
my @temp = ( [ qw( testthis 2000-4 2000-2 2000-1 2000-12 2001-1 2002-1
+2 1999-12 ) ], # bad
[ qw( testthis 1999-1 1999-2 1999-3 1999-4 1999-5 1999-6
+1999-7 1999-8 1999-9 1999-10 1999-11 1999-12 2001-3 ) ], # bad
[ qw( 1999-1 1999-2 1999-3 1999-4 1999-5 1999-6 1999-7 19
+99-8 1999-9 1999-10 1999-11 1999-12 2000-1 ) ], # good
[ qw( 1998-12 1999-1 ) ], # good
[ qw( 2004-1 ) ], # good
[ qw( 2004-12 ) ], # good
[ qw( 2004-12 2004-11 ) ], # good
[ qw( testthis 2004-12 2004-9 2004-11 ) ] ); # bad
foreach ( @temp ) {
my %dates = map { s/-(\d)$/-0$1/; $_, '' } @$_;
if ( exists $dates{ 'testthis' } ) {
delete $dates{ 'testthis' };
print Dumper( \%dates );
my ( $missing ) = find_gaps( \%dates );
print Dumper( $missing ), '-' x 20 ,"\n";
}
}
sub find_gaps {
my $dates = shift;
my @date_list = sort keys %$dates;
my $c_date = shift @date_list;
inc_date(\$c_date);
my @gaps = map { trap_gap(\$c_date, $_); } @date_list;
s/-0(\d)$/-$1/ for @gaps;
return \@gaps;
}
sub inc_date {
my $date = shift;
my ($y, $d) = split('-', $$date);
if ($d == 12) {
$y++; $d = 01;
} else {
$d++;
}
$d = "0$d" if $d < 10;
$$date = "${y}-${d}";
}
sub trap_gap {
my ($c, $t) = @_;
my @res = ();
if ($$c eq $t) {
inc_date($c);
return ();
}
while ($$c ne $t) {
push(@res, $$c);
inc_date($c);
}
inc_date($c);
return @res;
}
| [reply] [Watch: Dir/Any] [d/l] |
Re: Finding gaps in date ranges
by perlmonkey (Hermit) on May 11, 2001 at 05:47 UTC
|
Instead of trying to find out what you are missing, it is
a lot easier (and faster) to just exclude what you have.
sub find_gaps {
my $dates = shift;
my @dates = sort{ $a->[0] <=> $b->[0] || $a->[1] <=> $b->[1] }
map{[substr($_, 0, 4), substr($_, 5)]} keys %$dates;
my( $minY,$minM,$maxY,$maxM ) = (@{$dates[0]}, @{$dates[-1]});
my @gaps;
for my $y ($minY .. $maxY) {
exists $dates->{"$y-$_"} ? next : push @gaps, "$y-$_"
for ($minY == $y ? $minM : 1) .. ($maxY == $y ? $maxM : 12);
}
return \@gaps;
}
This code sorts to finds the min and max of the date range, then
loops over the entire range in order looking to see if
we have that date already. If we dont have the date then
keep it to return. | [reply] [Watch: Dir/Any] [d/l] |
(dws)Re: Finding gaps in date ranges
by dws (Chancellor) on May 11, 2001 at 02:54 UTC
|
Here's a quick-and-dirty way to find gaps by counting the number of dates you know, then advancing from the first one, recording gaps until you've seen all the known dates there are to see. It assumes no leading zeros in months, and handles the edge cases of being handed an empty list, or a list with one date in it. Note the bruce-force determination of the initial date in the range.
sub find_gaps {
my $dates = shift;
my $ndates = scalar keys %$dates;
my @missing = ();
return \@missing if $ndates == 0;
# Determine the first date in the range
my $year = substr((sort keys %$dates)[0],0,4);
my $month = 1;
while ( ! exists $$dates{ "$year-$month" } ) {
do {$month = 1; ++$year} if ++$month > 12;
}
--$ndates;
# Look for gaps until we've seen each date in the range
while ( $ndates ) {
do { $month = 1; ++$year } if ++$month > 12;
if ( exists $$dates{"$year-$month"} ) {
--$ndates;
} else {
push @missing, "$year-$month";
}
}
return \@missing;
}
| [reply] [Watch: Dir/Any] [d/l] |
Re: Finding gaps in date ranges
by mr.nick (Chaplain) on May 11, 2001 at 08:02 UTC
|
#!/usr/bin/perl -w
use strict;
sub find_gaps {
my $dates=shift;
my @dates=sort map { sprintf "%04d-%02d",split/-/ } keys %$dates;
my ($c,$l)=($dates[0],$dates[$#dates]);
my @gaps;
while ($l cmp $c) {
my $r=sprintf("%d-%d",split(/-/,$c));
push @gaps,$r unless defined $dates->{$r};
substr($c,5,2)++;
if (substr($c,5,2)==13) {
substr($c,5,2)='01';
substr($c,0,4)++;
}
}
\@gaps;
}
my %dates=map { $_,'' } qw ( 2000-4 2000-2 2000-1 2000-12 2001-1 2002-
+12 1999-12 );
my $gaps=find_gaps \%dates;
print "dates\n",join("\n",sort keys %dates),"\n";
print "gaps\n",join("\n",@$gaps),"\n";
| [reply] [Watch: Dir/Any] [d/l] [select] |
|
I am absolutely baffled by MeowChow solutions. What the heck does +pop do?
Sorry {g} I should have put my golf clubs away :) The +pop simply removes the first argument from @_, which is the hash reference, and the %{...} around it dereference the hash ref. The + is there because the dereference by the enclosing %{...} is otherwise ambiguous. Without it, Perl would assume you are just referencing a hash variable named %pop, since the characters between the braces would all be [a-zA-Z0-9_ ]. Hmm... I hope that made some semblence of sense.
MeowChow
s aamecha.s a..a\u$&owag.print | [reply] [Watch: Dir/Any] [d/l] [select] |
Re: Finding gaps in date ranges
by Anonymous Monk on May 11, 2001 at 19:31 UTC
|
# assuming all dates are in %dates
use Time::Local;
my %missed = ();
my ($mon, $yr) = split /-/, (sort keys %dates)[0];
my $from = timelocal(1, 0, 0, 1, $mon-1, $yr);
my ($mon, $yr) = split /-/, (sort keys %dates)[-1];
my $to = timelocal(59, 59, 23, 1, $mon-1, $yr);
while ($from < $to){
@date = localtime($from);
$date[4]++; $date[5] += 1900; # well, you know...
# add to %missed if not in %dates
$missed{"$date[5]-$date[4]"}++
unless exists $dates{"$date[5]-$date[4]"};
$from += 60*60*24; # add one day
}
print "$_\n" for keys %missed;
--perlplexer | [reply] [Watch: Dir/Any] [d/l] |
Re: Finding gaps in date ranges
by birdbrane (Chaplain) on May 11, 2001 at 19:54 UTC
|
How about Date::Calc? This allows you to input two
different dates and get the difference. I used it for
a "Baby Pool" for before my daughter was born (family
members could bet on date, weight and gender. Here is
a snippet of what I did:
use Date::Calc qw(Delta_Days);
$day_diff{$name} = abs(Delta_Days(2000, $actual_month, $
actual_day, 2000, $guess_month, $guess_day));
From what I remember, the module was pretty flexible w/
how the date could be input.
Side Note: It appears that there are disagreeing
opinions about the module. Test it and see how it fits for
you...
bb | [reply] [Watch: Dir/Any] [d/l] |
|
|