I have a sequence file, seq.txt, in which each line gives the start position of a 150 bp sequence, the range that position falls in, and the sequence itself.
seq.txt
position range seq
46 1..1524 -----------------
832 1..1524 -----------------
1008 1..1524 -----------------
1407 1..1524 -----------------
2360 2052..3260 ------------------
2967 2052..3260 ------------------
403 1..1524 -----------------
800 1..1524 -----------------
2986 2052..3260 ------------------
3170 2052..3260 ------------------
I want to find out how much sequence is covered within a particular range, counting overlapping bases only once. For example, the range 1..1524 is reported with 6 positions (46, 403, 800, 832, 1008, 1407), and the sequences at 800 and 832 overlap.
The output should be (150+150+32+118+32+150+150) = 782.
In my script I try to build an array covering positions 1 to 1524, setting an element to 1 if it lies between a position and position+150, and to 0 otherwise. I think there is a problem with the loop, because it does not give the correct output. Any help or ideas will be appreciated. My script is:
use strict;
use warnings;
my $i;
my $j;
my $seq;
open my $m, '<', 'count_try.txt' or die 'Cannot open count_try.txt';
while ($seq = <$m>) {
    chomp $seq;
    my (@range) = split(/\t/, $seq);
    my ($fm, $to) = split(/\.\./, $range[1]);
    my $r = $range[0];
    for ($i = $fm; $i <= $to; $i++) {
        $j = $r + 150;
        if ($i >= $r and $i <= $j) {
            $range[$i] = 1;
        }
        else {
            $range[$i] = 0;
        }
        print $range[$i]."\t";
    }
}
close $m;
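For comparison, here is a minimal sketch of the marking idea described above. It is one possible approach under stated assumptions, not a definitive fix. The names covered_length and %starts_by_range are hypothetical and do not come from the script above. It assumes that each sequence covers exactly 150 bases starting at its position, that fields are separated by whitespace, and that a sequence running past the end of its range is not clipped (which is what the expected total of 782 implies). Unlike the script above, it collects the marks from every line of a range before counting them.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): returns the number
# of bases covered by at least one sequence. Each sequence starts at a value
# in @$starts and spans $len bases; overlapping bases are counted once.
# Assumption: sequences are NOT clipped at the end of the range.
sub covered_length {
    my ($starts, $len) = @_;
    my %covered;
    for my $start (@$starts) {
        # Mark every base of this sequence; re-marking a base is harmless.
        $covered{$_} = 1 for $start .. $start + $len - 1;
    }
    return scalar keys %covered;
}

# Sample lines copied from seq.txt (sequence column omitted for brevity).
my @lines = (
    "position range seq",
    "46 1..1524",
    "832 1..1524",
    "1008 1..1524",
    "1407 1..1524",
    "2360 2052..3260",
    "2967 2052..3260",
    "403 1..1524",
    "800 1..1524",
    "2986 2052..3260",
    "3170 2052..3260",
);

# Group the start positions by their range string.
my %starts_by_range;
for my $line (@lines) {
    my ($pos, $range) = split ' ', $line;
    # Skip the header line and any blank or malformed lines.
    next unless defined $range && $pos =~ /^\d+$/;
    push @{ $starts_by_range{$range} }, $pos;
}

# Print one coverage total per range.
for my $range (sort keys %starts_by_range) {
    my $total = covered_length($starts_by_range{$range}, 150);
    print "$range\t$total\n";
}
```

For the range 1..1524 this prints 782, matching the total worked out by hand. A hash is used for the marks rather than an array indexed from 1 so that positions past the end of the range need no special handling. If sequences should instead be cut off at the end of the range, the marking loop would need an upper bound.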