Re: indexing segments

Not really an answer to your question but a solution:

You could create a database table such as:

create table elements (
    position      int8,
    start_value   int8,
    end_value     int8
);
[download]

Then, after filling the table, this query would give you the desired result:

select position from element where end_value >= 50000 and start_value 
+<= 1000000;
[download]

This effectively gives the tree hassle to the database.

UPDATE: Changed the SQL query to reflect the original query in the question (and a typo)

--tidiness is the memory loss of environmental mnemonics

Comment on Re: indexing segments Select or Download Code

Replies are listed 'Best First'.
Re: indexing segments by Abigail-II (Bishop) on Oct 10, 2003 at 12:53 UTC
This effectively give the tree hassle to the database. Well, it's more saying to the database "please perform a linear search for me". Kind of an overkill to use a database for that, you might as well use a grep statement in Perl. And no, creating indices on start_value and end_value isn't going to help you in this case - it may still lead to a table scan (and hence a linear search), and just report a single match. Abigail	[reply]
Re: Re: indexing segments by EvdB (Deacon) on Oct 10, 2003 at 13:03 UTC
I don't think it is a straight linear scan - there is possible confusion over what the data is: I have a sizeable array of segments (> 100K), each element basically has a length of about 100-1000 and a start/end value anywhere between 1 and a few hundred million. If it is just a list of single values then 'a length of about 100-1000' and 'a start/end value anywhere between 1 and a few hundred million' are mutually exclusive. I may be completely wrong but it would appear that each array element is more involved than just a single value but can be summarised with a start and end value. This make the database more suitable. There are also the below mentioned storage considerations. That said if it is just a list of values you are correct - grep or similar would be the way forward. --`tidiness is the memory loss of environmental mnemonics`	[reply] [d/l]
Re: Re: Re: indexing segments by glwtta (Hermit) on Oct 10, 2003 at 13:35 UTC
If it is just a list of single values then 'a length of about 100-1000' and 'a start/end value anywhere between 1 and a few hundred million' are mutually exclusive. I guess I phrased this poorly. It is not a list of single values, each element has two values: a start position, and an end position. What I meant was that the space these elements occupy can go up to a few hundred million (ie, with only a few hundred thousand elements it is fairly sparsely populated), but the length of each element (end - start) is usually no more than a few thousand. Hope that clarifies it. That said if it is just a list of values you are correct - grep or similar would be the way forward. grep is the slow way to go, it not being good enough is why I am seeking the wisdom in the first place :)	[reply]
Re: Re: Re: Re: indexing segments by EvdB (Deacon) on Oct 10, 2003 at 13:50 UTC
Re: Re: indexing segments by glwtta (Hermit) on Oct 10, 2003 at 13:21 UTC
Hm, I'm pretty sure this would do a seq scan on the table. In any case, that doesn't help me; my problem is that a linear search in memory would be too slow, throwing a database in there is definitely not going to help.	[reply]