Sure, I will write out the complete problem I am facing. Here is the input file I am using.
sxoght: #query hit score probability qstart qend qorientation tstart tend matches mismatches gapOpening gaps
@SNPSTER4_104_308EFAAXX:1:1:1694:128
GGGATAAGAGAGGTGCATGTTGGTATTTAAGGTAGT
1 alignment(s) -- reports limited to 10 alignment(s)
sxoght: SNPSTER4_104_308EFAAXX:1:1:1694:128 gi|122939163|ref|NM_000165.3| -10 1.000000 1 36 + 1595 1630 35 1 00
Score = -10, P(A|R) = 1.000000
Query: 1 GGGATAAGAGAGGTGCATGTTGGTATTTAAGGTAGT 36
|||||||||||||||||||||||||||||| |||||
Sbjct: 1595 GGGATAAGAGAGGTGCATGTTGGTATTTAAAGTAGT 1630
@SNPSTER4_104_308EFAAXX:1:1:1608:94
GCAGTTTTAAGTTATTAGTTTTTAAAATCAGTACTT
14 alignment(s) -- reports limited to 10 alignment(s)
sxoght: SNPSTER4_104_308EFAAXX:1:1:1608:94 gi|113412254|ref|XR_018775.1| 0 0.090884 1 36 + 1578 1613 36 0 00
Score = 0, P(A|R) = 0.090884
Query: 1 GCAGTTTTAAGTTATTAGTTTTTAAAATCAGTACTT 36
||| |||||||||||||||||||||||||||||| |
Sbjct: 1578 GCATTTTTAAGTTATTAGTTTTTAAAATCAGTACGT 1613
This is a big file, but the whole file looks like this.
What I am trying to do is pull out each header line (the lines starting with sxoght) and display it in columns, and also show where the alignment between Query and Sbjct has a mismatch. For this input file, the expected results should look like:
>gi|122939163|ref|NM_000165.3| 1595 1630 SNPSTER4_104_308EFAAXX:1:1:1694:128 1 36 36 -10 1 1.000000 35
mismatch : 1625.GA
>gi|113412254|ref|XR_018775.1| 1578 1613 SNPSTER4_104_308EFAAXX:1:1:1608:94 1 36 36 0 1 0.090884 36
mismatch : 1581.GT 1612.TG
The code I have written is:
#!/usr/bin/perl
open(FILE,"align.input") or die "can not open file";
while($var=<FILE>){
    $str1=();$str2=();
    if($var=~/^sxoght:/){
        @ar=split(/\s+/,$var);
        print ">$ar[2]\t$ar[8]\t$ar[9]\t$ar[1]\t$ar[5]\t$ar[10]\t$ar[6]\t$ar[3]\t$ar[4]\t$ar[11]\n";
    }
    if($var=~/^Query:/){
        $str1=$var;
        $str1=~s/^Query:\s+//g;
        $str1=~s/\d+\s+//g;
        $str1=~s/\s+//g;
    }
    if($var=~/^Sbjct:/){
        $str2=$var;
        $str2=~s/^Sbjct:\s+//g;
        $str2=~s/\d+\s+//g;
        $str2=~s/\s+//g;
    }
    for($i=0;$i<=length($str1);$i++)
    {
        if(substr($str1,$i,1) ne substr($str2,$i,1)){
            print substr($str1,$i,1);
            print substr($str2,$i,1);
            print "$i\n";
        }
    }
}
I am not able to use strict and warnings, because with them enabled I cannot access a scalar variable outside the loop. In my code I am trying to extract the mismatch positions first, so that I can then subtract them from the start and end values already stored in @ar. I am having problems with the for loop. I know where I am going wrong, but I don't know how to correct it. Please help!
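For reference, here is a minimal sketch of one way the loop could be restructured so that it runs under strict and warnings. The idea is to declare the query string with my before the while loop, so it survives from the Query: line to the Sbjct: line, and to run the comparison only once the Sbjct: line has been read. It rests on a few assumptions: the alignments are ungapped and Query and Sbjct have equal length, the column order is the one from the print statement in the script above (not necessarily the exact expected layout), and the names mismatch_list, report and $sample are made up for illustration. The sample is held in a string only so that the sketch runs on its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# One record from the post, kept in a string so the sketch is self-contained.
# In the real script the handle would come from
#   open(my $fh, '<', 'align.input') or die "cannot open align.input: $!";
my $sample = <<'END';
sxoght: SNPSTER4_104_308EFAAXX:1:1:1608:94 gi|113412254|ref|XR_018775.1| 0 0.090884 1 36 + 1578 1613 36 0 00
Query: 1 GCAGTTTTAAGTTATTAGTTTTTAAAATCAGTACTT 36
Sbjct: 1578 GCATTTTTAAGTTATTAGTTTTTAAAATCAGTACGT 1613
END

# Hypothetical helper: compare two aligned sequences base by base and return
# tokens of the form "position.QuerybaseSbjctbase", e.g. "1581.GT".
# $tstart is the subject coordinate of the first aligned base.
# Assumes both strings have the same length and contain no gap characters.
sub mismatch_list {
    my ($query, $sbjct, $tstart) = @_;
    my @found;
    for my $i (0 .. length($query) - 1) {
        my $q = substr($query, $i, 1);
        my $s = substr($sbjct, $i, 1);
        push @found, ($tstart + $i) . ".$q$s" if $q ne $s;
    }
    return @found;
}

# Hypothetical driver: read the file line by line and return the output lines.
sub report {
    my ($fh) = @_;
    # Declared before the loop, so $query is still set when the Sbjct: line
    # arrives, and strict has nothing to complain about.
    my (@out, $query);
    while (my $line = <$fh>) {
        if ($line =~ /^sxoght:\s+[^#]/) {    # skips the "#query ..." legend line
            my @f = split ' ', $line;
            push @out, '>' . join("\t", @f[2, 8, 9, 1, 5, 10, 6, 3, 4, 11]);
        }
        elsif ($line =~ /^Query:\s+\d+\s+(\S+)/) {
            $query = $1;
        }
        elsif (defined $query and $line =~ /^Sbjct:\s+(\d+)\s+(\S+)/) {
            my ($tstart, $sbjct) = ($1, $2);
            my @mm = mismatch_list($query, $sbjct, $tstart);
            push @out, "mismatch : @mm" if @mm;
            undef $query;                     # reset for the next record
        }
    }
    return @out;
}

open(my $fh, '<', \$sample) or die "cannot open in-memory sample: $!";
print "$_\n" for report($fh);
close $fh;
```

For the sample record this prints the header columns followed by "mismatch : 1581.GT 1612.TG". Note that in the original script $str1 and $str2 are reset at the top of every iteration, so by the time the Sbjct: line is read the query string is already gone, and the for loop also runs on every input line rather than once per alignment.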