Check if Date interval contains Hour X

gulden has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks,

I need your help finding the best algorithm for my task. The task is simple: read lines from a file and check, for each line, whether its date interval contains any of a set of hours (the interval will never be longer than 24 hours).

The file is in the form:

Start_Time         | End Time        | TEXT
2009-07-22 08:00:00|2009-07-22 08:00:00|blablalblabla
2009-07-22 01:00:00|2009-07-22 01:00:00|blablalblabla
2009-07-22 08:00:00|2009-07-22 21:00:00|blablalblabla
2009-07-22 23:00:00|2009-07-23 00:00:00|blablalblabla
2009-07-22 23:00:00|2009-07-23 02:00:00|blablalblabla

The "pseudo code" should be something like this:


my @hours = (1, 11);  # Hours to check

open(FILE, '<', 'file.txt') or die "$!";

<FILE>;  # Skip first line

while (<FILE>) {  # Correction after [ack] comment
    chomp;
    my ($start_date, $stop_date, $text) = split /\|/;  # '|' must be escaped in the pattern

    print "\nInterval $start_date - $stop_date\n";
    foreach my $hour (@hours) {
        if ( ($start_date to $stop_date) contains $hour ) {  # pseudo code
            print "Hour $hour: Match\n";
        } else {
            print "Hour $hour: Not Match\n";
        }
    }
}

close(FILE);

The output for the sample file should be (corrected after graff comments):

Interval 2009-07-22 08:00:00 - 2009-07-22 08:00:00
Hour 1 :Not Match
Hour 11:Not Match

Interval 2009-07-22 01:00:00 - 2009-07-22 01:00:00
Hour 1 :Match
Hour 11:Not Match

Interval 2009-07-22 08:00:00 - 2009-07-22 21:00:00
Hour 1 :Not Match
Hour 11:Match

Interval 2009-07-22 23:00:00 - 2009-07-23 00:00:00
Hour 1 :Not Match
Hour 11:Not Match


Interval 2009-07-22 23:00:00 - 2009-07-23 02:00:00
Hour 1 :Match
Hour 11:Not Match

Any help will be helpful.

«A contentious debate is always associated with a lack of valid arguments.»


Re: Check if Date interval contains Hour X by ikegami (Patriarch) on Jul 23, 2009 at 18:58 UTC
use strict;
use warnings;

use DateTime                   qw( );
use DateTime::Format::Strptime qw( );

# Returns true if range ($dt_s, $dt_e)
# spans any part of any of the @hours.
sub hour_in_range {
    my ($dt_s, $dt_e, @hours) = @_;

    # Find the floor.
    ( $dt_s = $dt_s->clone() )
        ->truncate( to => 'hour' );

    for my $hour (@hours) {
        ( my $dt = $dt_s->clone() )
            ->set_hour($hour);
        $dt->add( days => 1 ) if $dt < $dt_s;
        return 1 if $dt <= $dt_e;
    }

    return 0;
}

{
    my @hours = (1, 11);  # Hours to check

    my $parser = DateTime::Format::Strptime->new(
        pattern   => '%Y-%m-%d %H:%M:%S',
        time_zone => 'local',
    );

    <DATA>;  # Skip first line
    while (<DATA>) {
        chomp;
        my ($ts_s, $ts_e, $text) = split /\|/;
        my $dt_s = $parser->parse_datetime( $ts_s );
        my $dt_e = $parser->parse_datetime( $ts_e );
        if (hour_in_range($dt_s, $dt_e, @hours)) {
            print "Match\n";
        } else {
            print "Not Match\n";
        }
    }
}

__DATA__
Start_Time         |End Time           |TEXT
2009-07-22 08:00:00|2009-07-22 08:00:00|blablalblabla
2009-07-22 01:00:00|2009-07-22 01:00:00|blablalblabla
2009-07-22 08:00:00|2009-07-22 21:00:00|blablalblabla
2009-07-22 23:00:00|2009-07-23 00:00:00|blablalblabla
2009-07-22 23:00:00|2009-07-23 02:00:00|blablalblabla
Re^2: Check if Date interval contains Hour X by alexm (Chaplain) on Jul 23, 2009 at 21:40 UTC
Another (maybe simpler) way of truncating:

sub hour_in_range {
    my ($dt_s, $dt_e, @hours) = @_;

    for my $hour (@hours) {
        my $dt = $dt_s->clone()
            ->truncate( to => 'day' )
            ->add( hours => $hour );
        $dt->add( days => 1 ) if $dt < $dt_s;
        return 1 if $dt <= $dt_e;
    }

    return 0;
}
Re^3: Check if Date interval contains Hour X by ikegami (Patriarch) on Jul 23, 2009 at 21:50 UTC
That introduced a bug. hour_in_range no longer performs as documented. Test case: `2010-10-10 01:59:59|2010-10-10 01:59:59|blablalblabla`
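The difference on that test case can be reproduced without DateTime. The following is only a sketch under assumptions: it swaps in plain epoch seconds via the core Time::Local module, treats timestamps as UTC (so time zones and DST are ignored), and the names `to_epoch`, `hour_in_range_floor_hour` and `hour_in_range_floor_day` are made up for illustration. The first function mirrors the version that truncates the start to the hour, the second mirrors the variant that compares against the untruncated start.

```perl
use strict;
use warnings;
use Time::Local qw( timegm );

# Hypothetical helper: 'YYYY-MM-DD HH:MM:SS' -> epoch seconds (treated as UTC).
sub to_epoch {
    my ($ts) = @_;
    my ($y, $mo, $d, $h, $mi, $s) = split /\D/, $ts;
    return timegm($s, $mi, $h, $d, $mo - 1, $y);
}

# Floor is the start truncated to the hour, so any part of an hour counts.
sub hour_in_range_floor_hour {
    my ($start, $end, @hours) = @_;
    my $floor    = $start - $start % 3600;     # start of the starting hour
    my $midnight = $start - $start % 86400;    # start of the starting day
    for my $hour (@hours) {
        my $t = $midnight + $hour * 3600;
        $t += 86400 if $t < $floor;            # hour already passed: try next day
        return 1 if $t <= $end;
    }
    return 0;
}

# Variant that compares against the untruncated start time.
sub hour_in_range_floor_day {
    my ($start, $end, @hours) = @_;
    my $midnight = $start - $start % 86400;
    for my $hour (@hours) {
        my $t = $midnight + $hour * 3600;
        $t += 86400 if $t < $start;            # 01:00:00 < 01:59:59, so it jumps a day
        return 1 if $t <= $end;
    }
    return 0;
}

my $s = to_epoch('2010-10-10 01:59:59');
my $e = $s;
print hour_in_range_floor_hour($s, $e, 1, 11), "\n";    # prints 1
print hour_in_range_floor_day($s, $e, 1, 11), "\n";     # prints 0
```

With the untruncated start, 01:00:00 is considered already past at 01:59:59 and gets pushed to the next day, which is why the interval inside hour 1 is reported as not matching.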
Re^4: Check if Date interval contains Hour X by alexm (Chaplain) on Jul 23, 2009 at 22:41 UTC
Re: Check if Date interval contains Hour X by alexm (Chaplain) on Jul 23, 2009 at 18:42 UTC
I'd suggest taking a look into DateTime::Span.
Re^2: Check if Date interval contains Hour X by ikegami (Patriarch) on Jul 23, 2009 at 19:21 UTC
I thought you were on to something until I realized the real problem is finding the right date to go with the hour. Using ::Span makes things a lot more complicated since you now need to make 4 DateTime objects for every hour.
Re^3: Check if Date interval contains Hour X by alexm (Chaplain) on Jul 23, 2009 at 20:08 UTC
The very first thing that came to my mind after reading the OP was the method `DateTime::Span::contains`. However, yours is the right approach since that method only works for sets that are fully inside, as the manual says. Thanks!
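The distinction between "fully inside" and "spans any part" can be illustrated with plain interval arithmetic. This is a sketch only, not the DateTime::Span API: `contains_span` and `intersects_span` are hypothetical helpers, and intervals are represented as closed [start, end] pairs of seconds since midnight.

```perl
use strict;
use warnings;

# True only if the inner interval lies entirely within the outer one.
sub contains_span {
    my ($outer, $inner) = @_;
    return ( $outer->[0] <= $inner->[0] && $inner->[1] <= $outer->[1] ) ? 1 : 0;
}

# True if the two intervals share at least one instant.
sub intersects_span {
    my ($x, $y) = @_;
    return ( $x->[0] <= $y->[1] && $y->[0] <= $x->[1] ) ? 1 : 0;
}

my $interval = [ 1 * 3600 + 30 * 60, 1 * 3600 + 45 * 60 ];    # 01:30:00 .. 01:45:00
my $hour_one = [ 1 * 3600,           2 * 3600 - 1 ];          # 01:00:00 .. 01:59:59

print contains_span($interval, $hour_one), "\n";      # prints 0: hour 1 is not fully inside
print intersects_span($interval, $hour_one), "\n";    # prints 1: the interval touches hour 1
```

A containment test rejects a short interval that falls inside the hour, whereas an overlap test accepts it, which matches the "spans any part of any of the @hours" wording.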
Re^4: Check if Date interval contains Hour X by ikegami (Patriarch) on Jul 23, 2009 at 20:26 UTC
Re^5: Check if Date interval contains Hour X by alexm (Chaplain) on Jul 23, 2009 at 21:33 UTC
Some notes below your chosen depth have not been shown here
Re: Check if Date interval contains Hour X by graff (Chancellor) on Jul 24, 2009 at 03:31 UTC
Um... according to your pseudocode, there should be 10 lines of output, because you seem to want to output one line for each of your two "hour" values tested against each of the 5 lines of input data. And in that regard, wouldn't it be helpful for the output to mention which of the "hour" values was being tested each time, and what time span it was being tested against?

As for the overall approach, I'm with ikegami (with some minor alterations): in order to make this workable in a general way, what you really want is a subroutine that takes three args: a targeted hour, and the start and end date/time values to test. A little sanity checking on the data would be worthwhile as well, and in case it counts for anything, there are some short-cuts you can take advantage of...

#!/usr/bin/env perl

use strict;
use Date::Calc qw/Date_to_Time/;

sub hour_in_span
{
    my ( $hr, $bgn, $end ) = @_;
    return unless ( $hr =~ /^\d{1,2}$/ and $hr >= 0 and $hr <= 23 );
    my $hr2 = sprintf( "%02d", $hr );  # make sure to use 2 digits
    for ( $bgn, $end ) {
        return unless ( /^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}$/ );
    }
    return unless $bgn le $end;  # require that args be in correct sequence

    # easiest case: start == end == hour of interest
    if ( $bgn eq $end ) {
        return ( $bgn =~ / $hr2:/ );
    }

    # next easiest: span from start to end >= one full day
    # so it must include hour of interest
    my $ep_bgn = Date_to_Time( split /\D/, $bgn );
    my $ep_end = Date_to_Time( split /\D/, $end );
    return 1 if (( $ep_end - $ep_bgn ) / ( 24 * 60 * 60 ) >= 1 );

    # hardest case:
    # -- plug hour of interest into each endpoint of the span
    #    and see if either resulting time stamp falls within the span
    ( my $test_bgn = $bgn ) =~ s/ \d{2}:/ $hr2:/;
    ( my $test_end = $end ) =~ s/ \d{2}:/ $hr2:/;
    my $test_ep_bgn = Date_to_Time( split /\D/, $test_bgn );
    my $test_ep_end = Date_to_Time( split /\D/, $test_end );
    return (( $ep_bgn <= $test_ep_bgn and $ep_end >= $test_ep_bgn ) or
            ( $ep_bgn <= $test_ep_end and $ep_end >= $test_ep_end ));
}

## End of algorithm ## -- from here on down, we're just testing it

my @hours = ( 1, 11 );

while (<DATA>) {
    next unless ( /^\d{4}-/ );
    my ( $bgn, $end ) = split /\|/;
    for my $hr ( @hours ) {
        my $in = hour_in_span( $hr, $bgn, $end );
        if ( !defined( $in )) {
            print "$bgn -- $end / $hr : bad data\n";
        }
        elsif ( $in ) {
            print "$bgn -- $end / contains $hr\n";
        }
        else {
            print "$bgn -- $end / does NOT contain $hr\n";
        }
    }
}

__DATA__
Start_Time         | End Time        | TEXT
2009-07-22 08:00:00|2009-07-22 08:00:00|blablalblabla
2009-07-22 01:00:00|2009-07-22 01:00:00|blablalblabla
2009-07-22 08:00:00|2009-07-22 21:00:00|blablalblabla
2009-07-22 23:00:00|2009-07-23 00:00:00|blablalblabla
2009-07-22 23:00:00|2009-07-23 02:00:00|blablalblabla

(BTW: you could relax the constraint on having the subroutine args in a specified order. So long as the "targeted hour" arg is in the right place, the other two args are interchangeable; just use the lower value as $bgn and the higher one as $end.)
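The relaxed argument order mentioned in the aside could look roughly like this. It is a minimal sketch: `ordered_span` is a hypothetical helper, and it relies on the assumption that both timestamps use the fixed-width 'YYYY-MM-DD HH:MM:SS' layout, so that plain string comparison agrees with chronological order.

```perl
use strict;
use warnings;

# Hypothetical helper: return the two timestamps in ascending order.
# Fixed-width 'YYYY-MM-DD HH:MM:SS' strings sort chronologically with 'le'.
sub ordered_span {
    my ($x, $y) = @_;
    return $x le $y ? ( $x, $y ) : ( $y, $x );
}

my ( $bgn, $end ) = ordered_span( '2009-07-23 02:00:00', '2009-07-22 23:00:00' );
print "$bgn -- $end\n";    # prints "2009-07-22 23:00:00 -- 2009-07-23 02:00:00"
```

Calling such a helper at the top of the subroutine would replace the early return on out-of-order arguments.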
Re^2: Check if Date interval contains Hour X by gulden (Monk) on Jul 24, 2009 at 09:42 UTC
«Um... according to your pseudocode, there should be 10 lines of output, because you seem to want to output one line for each of your two "hour" values tested against each of the 5 lines of input data. And in that regard, wouldn't it be helpful for the output to mention which of the "hour" values was being tested each time, and what time span it was being tested against?»

You are right.
Re: Check if Date interval contains Hour X by ack (Deacon) on Jul 24, 2009 at 04:46 UTC
I'm curious, you open the file to filehandle `FILE` but then, after you've skipped the first line, you read from filehandle `LINE` in your `while (<LINE>)` block. Is that a typo? Should the `while` statement be reading from `FILE`?

ack Albuquerque, NM
Re^2: Check if Date interval contains Hour X by gulden (Monk) on Jul 24, 2009 at 09:26 UTC
Should be: `while (<FILE>)`
Re: Check if Date interval contains Hour X by gulden (Monk) on Jul 24, 2009 at 10:08 UTC
Thank you all for the excellent code and comments.