baxy77bax has asked for the wisdom of the Perl Monks concerning the following question:
Well i have a question for you guys. so what i need is, i need to do some interval searching. Example:
I have a set of random (randomly distributed) intervals (all related in the following way a<=b, where a and b are start and stop positions of intervals)
now what i need to do is to compute the coverage of a certain position in my interval. so if i have :
my coverage of the 1-10 would be :1..10 - interval and 2..5 - subinterval 4..7 - subinterval
the naive solution to do this would be to create an array and then for every interval position ++ the value indexed by the position. BUT this is not an option since i don't have enough memory on my machine (the intervals are as long ad 10^9 positions). Now what i could do is something like this:1->0 2->1 3->1 4->2 5->2 6->1 7->1 8->0 9->0 10->0
so, create a vector (or should i say several of them) and fill it with subinterval positions... Oh by the way every set of subintervals contains only non-overlapping intervals, meaning there cannot be a case as stated above , but the subintervals from different sets CAN overlap. so in the above case one subinterval would be from one set and the other from a different set(The number of intervals in both subinterval sets can be as high as interval/avg(length of subinterval)).#!/usr/bin/perl use Bit::Vector; use Data::Dumper; use strict; my @vecArray; my $x=0; while($x<4){ $vecArray[$x] = Bit::Vector->new(21); my $int = int(rand(21)); my $min = $int; my $max = $int<=20-5 ? $int+5 : $int; $vecArray[$x]->Interval_Fill($min,$max); $x++; } my $set1= Bit::Vector->new(21); $set1->Intersection($vecArray[0],$vecArray[2]); my $start = 0; while ($start < $set1->Size() and my ($min, $max) = $set1->Interval_Sc +an_inc($start)){ print $min . "-" . $max . "\n"; $start = $max+1 ; }
Does someone has a suggestion , tip, alternative approach ??|interval| = X |subinterval| = X/n (n- number of intervals) Y=X (Y - nubmber of subinterval sets) O(X^2) - not good if implemented naively
Cheers
UPDATE:
Thanks ppl, these three replays gave me quite a lot to think about. so, to give a quick global thank you and a few answers to monks that have replied so far.
I know you guys are not mind readers but you are really good at it :). so:
zek152 : This is a nice idea and as soon as you posted it i started to work on it, actually i knew i read about it somewhere before (never had a need to actually do any implementation though) so when i saw your post i started working on it straight away. but my original idea was the same as the one the BrowserUk posted, though i didn't realised that it can be achieved it the way he did it. so the answer to BrowserUk's question is : yes. the thing i had in mind when i was writing this post was that i have to do a pair-wise comparison for every two possible pairs of vectors then we are talking about "for within a for loop" and that is O(x^2) but as you pointed out looping the intersect through vectors does the same thing essentially (meaning useful to me) but in the O(x) time. so once more a fresh pair of eyes see better then sleepy old ones.
Of course searching through start and stop indexes is a good solution but unfortunately my intervals are rather short (sorry for not stating that) which means too large index arrays... But thanx.
So thank you guys once more and this is definitely going to be a good reference post for me , of course if anyone else has an alternative strategy he/she is more then welcome to post it :)
Cheers
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
Re: searching a vector array
by Grimy (Pilgrim) on Jun 09, 2011 at 17:17 UTC | |
|
Re: searching a vector array
by BrowserUk (Patriarch) on Jun 09, 2011 at 19:12 UTC | |
|
Re: searching a vector array
by zek152 (Pilgrim) on Jun 09, 2011 at 16:30 UTC | |
by zek152 (Pilgrim) on Jun 09, 2011 at 18:50 UTC | |
|
Re: searching a vector array
by johngg (Canon) on Jun 09, 2011 at 23:42 UTC |