in reply to Re: Overlapping portions of sub strings
in thread Overlapping portions of sub strings
I posted the original question as 'Anonymous'. What I was actually talking about was start and end positions for genes within a genome: I wanted to merge overlapping CDS regions (portions of genes) of a genome. With some large data sets there may be too many positions to fit in an array for your method to work.

The method I chose was to generate a list in which each line held the start and end positions of a single CDS region. The list was ordered first by start position and then by end position. New lists were then produced recursively: on each recursion, wherever two adjacent regions overlapped, they were written to the new list as a single line giving the start and end positions of the merged region. The reason I said this was not foolproof is that I put a limit on the number of recursions that could take place (to limit the number of files generated), so overlaps could remain if the limit was reached first.

I guess my method could also offer an advantage in that line tagging could be used to deal with more complex merging scenarios. For example, yellow portions of a string might be merged with blue portions but not with red ones.
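For illustration, the merging step can also be done in a single pass over the sorted list, which needs no recursion limit and no intermediate files. This is a minimal sketch in Python rather than the recursive list-per-pass approach described above; it assumes inclusive integer coordinates, and the function name `merge_regions` and the sample coordinates are made up for the example.

```python
def merge_regions(regions):
    """Merge overlapping (start, end) regions.

    Assumption: coordinates are inclusive integers, so regions that
    share a position (e.g. 1-5 and 5-7) count as overlapping.
    """
    merged = []
    # Tuples sort by start position first, then by end position.
    for start, end in sorted(regions):
        if merged and start <= merged[-1][1]:
            # Overlaps the most recent merged region: extend its end if needed.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            # No overlap: start a new merged region.
            merged.append([start, end])
    return [tuple(region) for region in merged]


# Hypothetical CDS coordinates, for illustration only.
cds = [(100, 250), (200, 400), (500, 600), (390, 450), (700, 800)]
print(merge_regions(cds))  # [(100, 450), (500, 600), (700, 800)]
```

Because the list is sorted once, each region only has to be compared with the last merged region, so one pass is enough however many regions chain together.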
Does anyone know how best to represent these methods mathematically?
Re: Re: Re: Overlapping portions of sub strings
by BrowserUk (Patriarch) on Jan 16, 2003 at 16:01 UTC