Recursive method to check overlap of many date ranges

bigbot has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Recursive method to check overlap of many date ranges by Perlbotics (Archbishop) on Sep 17, 2011 at 09:51 UTC
I do not think that recursion is helpful here, though not impossible. First step seems to come up with a proper definition of average number of trades open at the same time (What time? How to deal with inactivity?). Maybe you can just re-sample your data like so: Find start time (`time_start`) of first trade. Find end time (`time_end`) of last trade. Define a suitable sample interval (`interval`, e.g 15s). From `time_start` to `time_end` in increments of `interval` do Count the number of open trades at sample point: `count[n x interval]` Generate mean value from sampled list (maybe ignore samples with zero count). Alternatively: Use a sliding window and reduce further. Potential refinement: Ignore some of the first and last samples (transients). Update: Instead of the synchronous sampling method suggested above, count parallel trades asynchronously at open/close times (events) - including/excluding the opened/closed trade respectively. You might weight the count with its duration (relative to observation period) to get an average value. Here, duration of count might differ from duration of given trade since it is the time span between adjacent events. Example: Trade#1: open=10:10/close=11:15/duration=01:05; Trade#2: open=10:12/close=10:18/duration=00:06; Trade#3: open=10:15/close=10:26/duration=00:11; `=Asynchronous Method= =Sampling Method= Time Action Count Duration Weight Count 15s-Samples ----------------------------------------------- ------------------- 10:10 open#1 1 2 2/65 1 8 10:12 open#2 2 3 3/65 2 12 10:15 open#3 3 3 3/65 3 12 10:18 close#2 2 8 8/65 2 32 10:26 close#3 1 49 49/65 1 196 11:15 close#1 0 - - - - ----------------------------------------------- -------------------` [download] Observation period: 10:10-11:15 = 65 (Minutes) = 260 15s-samples Asynchronous Method: Weighted sum (count * weight) = 82/65; Events = 6; Duration = 65; Average = 82/65 = 1.26 Sampling Method: Sum = 328; Samples = 260; Duration = 260; Average = 328/260 = 1.26 (Pardon me for leaving out the units...) Seems, the asynchronous method is more efficient (same result, lesser computations).	[reply] [d/l] [select]
Re^2: Recursive method to check overlap of many date ranges by bigbot (Beadle) on Sep 17, 2011 at 20:40 UTC
Thanks for the responses. I should have been more specific regarding what I meant when I said the "average number of open trades". I do not care if there are no trades at all open. I also do not care about how long a trade is open. If two trades are open at the same time even for 1 minute I consider that two trades open at once. So I'm not exactly sure the best calculation to use.. Perlbotics your asynchronously at open/close times seems to be what I'm looking for. So far the testing shows that it's pretty accurate. Thanks! Let me know if you have any update considering the definition I presented for average number of open trades.	[reply]
Re: Recursive method to check overlap of many date ranges by LanX (Saint) on Sep 17, 2011 at 21:24 UTC
Simple math: The average number of overlaps is the sum of all interval length divided by the complete observation period. If not, then please give example data demonstrating what you want. Cheers Rolf	[reply]
Re: Recursive method to check overlap of many date ranges by choroba (Cardinal) on Sep 17, 2011 at 09:40 UTC
Maybe inversion lists can help you? See RFC: The Lazy Manager's Calendar with Inversion Lists.	[reply]
Re: Recursive method to check overlap of many date ranges by mshipper (Novice) on Mar 30, 2012 at 17:14 UTC
I have some code on CPAN that does this.. I think. The example is take from the Data::Range::Compare::Stream::CookBook and is intended to compute exact downtime of a bank branch location when all 3 vnps are offline. The concept is the same as computing network down time when there are multiple redundant links. You just want to know where the overlap is and how long each overlap is, then average the time out based on your current data. Update: I found POSIX mktime was faster by a land slide than DateTime. Output is as follows: Total Number of Trades: 2339 Total Trade Overlaps: 8034 Average Trade Overlap time in seconds: 45 Min Tade overlap time in seconds: 0 Max Tade overlap time in seconds: 918 Min number of overlapping trades: 2 Max number of overlapping trades: 5 Number of trades that did not overlap: 1222 Average trade time for ranges that did not overlap: 171 The input format for each file was as follows. 20120228162825 - 20120228164406 20120228170024 - 20120228171347 20120228172137 - 20120228173503 20120228173630 - 20120228174720 20120228180047 - 20120228180404 20120228181045 - 20120228181901 20120228183436 - 20120228183622 20120228185051 - 20120228185807 20120228190808 - 20120228192117 20120228193517 - 20120228194152 20120228194243 - 20120228195755 20120228195826 - 20120228200730 20120228201307 - 20120228201534 20120228202250 - 20120228203621 20120228204718 - 20120228205709 20120228210425 - 20120228211237 20120228211427 - 20120228212922 20120228213936 - 20120228214346 20120228215321 - 20120228215727 20120228220610 - 20120228221833 20120228223125 - 20120228224514 20120228225429 - 20120228230900 20120228232155 - 20120228232426 20120228233838 - 20120228234145 20120228234334 - 20120228235752 #!/usr/bin/perl use strict; use warnings; use POSIX qw(mktime); use lib qw(../lib); use Data::Range::Compare::Stream::Iterator::File; use Data::Range::Compare::Stream::Iterator::Compare::Asc; use Data::Range::Compare::Stream::Iterator::Consolidate::OverlapAsColu +mn; sub parse_line { my ($line)=@_; my $ref=[$line=~ /(\d+)/g]; foreach my $date (@$ref) { my @info=unpack('a4a2a2a2a2a2',$date); $info[0] -=1900; $info[1] -=1; $date=mktime($info[5],$info[4],$info[3],$info[2],$info[1],$info[0] +, 0, 0, 0); } return $ref; } my $cmp=new Data::Range::Compare::Stream::Iterator::Compare::Asc; my @it_list; foreach my $file (qw(posix_time_a.src posix_time_b.src posix_time_c.sr +c posix_time_d.src posix_time_e.src)) { my $iterator=new Data::Range::Compare::Stream::Iterator::File( parse_line=>\&parse_line, filename=>$file, NEW_FROM=>'Data::Range::Compare::Stream::PosixTime', ); # save our file iterator so we can figure out how many lines were in + each file push @it_list,$iterator; my $con=new Data::Range::Compare::Stream::Iterator::Consolidate::Ove +rlapAsColumn($iterator,$cmp); $cmp->add_consolidator($con); } my $total=0; my $time=0; my $non_overlaps=0; my $non_time=0; my $min_overlap=undef; my $max_overlap=0; my $max_overlap_count=0; my $min_overlap_count=undef; while($cmp->has_next) { my $result=$cmp->get_next; next if $result->is_empty; if($result->get_overlap_count>1) { my $overlap_time=$result->get_common->time_count; $max_overlap_count=$result->get_overlap_count if $max_overlap_coun +t < $result->get_overlap_count; if(defined($min_overlap_count)) { $min_overlap_count=$result->get_overlap_count if $min_overlap_co +unt > $result->get_overlap_count; } else { $min_overlap_count=$result->get_overlap_count; } $total +=$result->get_overlap_count; $time +=$overlap_time; $max_overlap=$overlap_time if $max_overlap < $overlap_time; if(defined($min_overlap)) { $min_overlap=$overlap_time if $min_overlap > $overlap_time; } else { $min_overlap=$overlap_time; } } else { $non_overlaps++; $non_time +=$result->get_common->time_count; } } my $total_trades=0; foreach my $it (@it_list) { $total_trades +=$it->get_pos; } print "Total Number of Trades: $total_trades\n"; print "Total Trade Overlaps: $total\n"; print "Average Trade Overlap time in seconds: ",int($time/$total),"\n" +; print "Min Tade overlap time in seconds: $min_overlap\n"; print "Max Tade overlap time in seconds: $max_overlap\n"; print "Min number of overlapping trades: $min_overlap_count\n"; print "Max number of overlapping trades: $max_overlap_count\n"; print "Number of trades that did not overlap: $non_overlaps\n"; print "Average trade time for ranges that did not overlap: ",int($non_ +time/$non_overlaps),"\n"; { package Data::Range::Compare::Stream::PosixTime; use strict; use warnings; use POSIX qw(strftime); use base qw(Data::Range::Compare::Stream); use constant NEW_FROM_CLASS=>'Data::Range::Compare::Stream::PosixTim +e'; sub format_range_value { my ($self,$value)=@_; strftime('%Y%m%d%H%M%S',localtime($value)); } sub range_start_to_string { my ($self)=@_; $self->format_range_value($self->range_start); } sub time_count { my ($self)=@_; $self->range_end - $self->range_start } sub range_end_to_string { my ($self)=@_; $self->format_range_value($self->range_end); } 1; } [download]	[reply] [d/l]
Re: Recursive method to check overlap of many date ranges by limzz (Novice) on Sep 21, 2011 at 19:41 UTC
Average number of trades open at the same time per what? I'll guess day? Think about what an average is. You need your data points and the quantity of data. Your data is trades open per time increment, your quantity is the number of increments (e.g. if you were looking at the data over a year, then 365 days). Now the problem is a pretty easy one. You could do it many different ways. Cycle through your time increments, for each just see how many trades were open.	[reply]