One option is to use a module that implements interval sets. CPAN has many; I can't say which one to recommend. Here's an example using Set::IntSpan.

    #! /usr/bin/perl
    use strict;
    use warnings;
    use Set::IntSpan;

    my $seq_len = 150;
    my (%seqs, %ranges);
    for (<>) {
        m/^(\d+)\s+(\d+)\.\.(\d+)/ or next;
        $seqs{$1} //= [ $1+0, $1 + $seq_len-1 ];
        $ranges{$2,$3} //= [ $2, $3 ];
    }
    my $S = Set::IntSpan->new([ values %seqs ]);
    printf "Total seq coverage: %d\n", $S->size;
    for (@ranges{ sort keys %ranges }) {
        my $window = [ $_->[0], $_->[1] + $seq_len-1 ];
        my $T = $S->intersect([ $window ]);
        printf "range [%d..%d] window [%d..%d] %s covers %d\n",
            @$_, @$window, $T->run_list, $T->size;
    }
    __END__
    Total seq coverage: 1251
    range [1..1524] window [0..1673] 46-195,403-552,800-981,1008-1157,1407-1556 covers 782
    range [2052..3260] window [2051..3409] 2360-2509,2967-3135,3170-3319 covers 469

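To me it looks like there's a bug in that module (version 1.19): the documentation promises that the operands are not affected, yet the intersect method appears to modify $window. In the output above, each printed window starts one below its range start (0 vs. 1, 2051 vs. 2052), although the script copies the range start into the window unchanged. Hm.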
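If the suspected side effect is a worry, the same union-then-intersect idea can be sketched in plain Perl without any module. This is only a rough sketch, not what Set::IntSpan does internally; the helper names (merge_spans, clip_spans, span_size) and the sample start positions are made up for illustration, and spans are assumed to be inclusive integer [lo, hi] pairs.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Merge inclusive [lo, hi] pairs into sorted, disjoint spans.
# Spans that overlap or abut (e.g. 1-3 and 4-9) are fused into one.
sub merge_spans {
    my @sorted = sort { $a->[0] <=> $b->[0] or $a->[1] <=> $b->[1] } @_;
    my @merged;
    for my $span (@sorted) {
        my ($lo, $hi) = @$span;
        if (@merged and $lo <= $merged[-1][1] + 1) {
            # Touches the previous span: extend it if needed.
            $merged[-1][1] = $hi if $hi > $merged[-1][1];
        }
        else {
            push @merged, [ $lo, $hi ];    # fresh copy, input untouched
        }
    }
    return @merged;
}

# Clip disjoint spans to the window [$wlo, $whi]; returns new spans.
sub clip_spans {
    my ($wlo, $whi, @spans) = @_;
    my @out;
    for my $span (@spans) {
        my $lo = $span->[0] > $wlo ? $span->[0] : $wlo;
        my $hi = $span->[1] < $whi ? $span->[1] : $whi;
        push @out, [ $lo, $hi ] if $lo <= $hi;
    }
    return @out;
}

# Number of integers covered by a list of disjoint spans.
sub span_size {
    my $n = 0;
    $n += $_->[1] - $_->[0] + 1 for @_;
    return $n;
}

# Hypothetical data: three reads of length 150 at made-up start positions.
my $seq_len = 150;
my @starts  = (46, 100, 403);
my @cover   = merge_spans(map { [ $_, $_ + $seq_len - 1 ] } @starts);
my @window  = (1, 300);
my @hit     = clip_spans(@window, @cover);

printf "total %d, window [%d..%d] %s covers %d\n",
    span_size(@cover), @window,
    join(',', map { "$_->[0]-$_->[1]" } @hit), span_size(@hit);
```

Because clip_spans builds new pairs instead of editing its arguments, the window bounds stay exactly as they were passed in.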

