I wanna match array element patterns($pats
$i) with the row sequence($str). Also find the overlapping region location in the sequence($str). Please help me.
But my loop is not work properly.
Thanks.
INPUT FILES:
seq.txt
>seq1
AAAAAA
>seq2
TTTTT
pat.txt
seq1 AAAAA TTTTT GGGGG
I want following output:
Patterns sequence name start end length of seq.
AAAAA seq1 1 5 6
AAAAA seq1 2 6 6
TTTTT seq2 1 5 5
not found
GGGGG
I wrote following PERL script:
#!/usr/bin/perl-w
# perl program to find motif in text file eith its positions.
use strict;
use Bio::SeqIO;
open(my $outfile, ">", "Motif_Result.txt");
my $file = 'seq.txt';
print {$outfile}"\t\t\t\t\t Positions\n";
print {$outfile}"\tPattern Name\tSequence Name\tStart\tEnd\tLength of Sequence\n\n";
my $patternseq= 'pat.txt';
open(FIH, $patternseq);
my $in;
my %patsmap; my @residue; my @pats; my $seqcount = 0;
while ($in=<FIH>)
{
chomp($in);
my @pats=split " ",$in;
my $residue=shift@pats;
foreach my $nnn (@pats) # @pats = (AAAAA, TTTTT, GGGGG);
{
$patsmap{$nnn}=@residue;
}
print "Patterns\t"; print "@pats"; print "\n";
my $length = @pats;
print "Total Patterns are: $length"; print "\n";
my $seqobj;
my $in = Bio::SeqIO->new(-format => 'fasta',
-file => $file);
my $motif_count = 0;
while ( my $seq = $in->next_seq)
{
$seqcount++; # count the number of sequences
my $head = $seq->display_id();
my $str = $seq->seq; # get the sequence as a string
$str =~ s/\*//g; # remove all '*' from sequence
my $i;
for ($i=0; $i<=$length; $i++) {
while ( $str =~ m/(?=$pats
$i)/ig)
{
my @l = split('', $str);
my $length=@l;
print {$outfile}"\t"; print {$outfile}$pats
$i;
printf {$outfile}"\t\t$head\t\t%d", pos($str)+1;
printf {$outfile}"\t%d", pos($str)+5;
print {$outfile}"\t"; print {$outfile}$length; print {$outfile}"\n";
}
}
}
print {$outfile}"\n\tTotal Sequences in file name $file\t:$seqcount\n";
exit;
}
Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
Read Where should I post X? if you're not absolutely sure you're posting in the right place.
Please read these before you post! —
Posts may use any of the Perl Monks Approved HTML tags:
- a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
| |
For: |
|
Use: |
| & | | & |
| < | | < |
| > | | > |
| [ | | [ |
| ] | | ] |
Link using PerlMonks shortcuts! What shortcuts can I use for linking?
See Writeup Formatting Tips and other pages linked from there for more info.