You should only print the output once $str1 and $str2 are both set, and then clear them so you don't print them again.

The following is very close to your code, but it more or less works. You only need to format the output as desired (I don't know where the large numbers come from):

    #!/usr/bin/perl
    open(FILE, "align.input") or die "can not open file: $!";
    while ($var = <FILE>) {
        if ($var =~ /^sxoght:/) {
            # new record: forget any leftover alignment strings
            ($str1, $str2) = ();
            @ar = split(/\s+/, $var);
            print ">$ar[2]\t$ar[8]\t$ar[9]\t$ar[1]\t$ar[5]\t$ar[10]\t$ar[6]\t$ar[3]\t$ar[4]\t$ar[11]\n";
        }
        if ($var =~ /^Query:/) {
            $str1 = $var;
            $str1 =~ s/^Query:\s+//g;
            $str1 =~ s/\d+\s+//g;
            $str1 =~ s/\s+//g;
        }
        if ($var =~ /^Sbjct:/) {
            $str2 = $var;
            $str2 =~ s/^Sbjct:\s+//g;
            $str2 =~ s/\d+\s+//g;
            $str2 =~ s/\s+//g;
        }
        # compare only once both lines of the pair have been read
        if (defined $str1 and defined $str2) {
            for ($i = 0; $i < length($str1); $i++) {
                if (substr($str1, $i, 1) ne substr($str2, $i, 1)) {
                    # this is not in the desired format, yet
                    print substr($str1, $i, 1);
                    print substr($str2, $i, 1);
                    print "$i\n";
                }
            }
            # clear them so the same pair isn't printed again
            ($str1, $str2) = ();
        }
    }

For the format, I suggest storing the results in an array and printing "mismatch:" only if the array isn't empty at the end:

    my @mismatch;
    for ($i = 0; $i < length($str1); $i++) {
        if (substr($str1, $i, 1) ne substr($str2, $i, 1)) {
            push @mismatch, "$i." . substr($str1, $i, 1) . substr($str2, $i, 1);
        }
    }
    if (@mismatch) {
        print "mismatch: @mismatch\n";
    }

After that modification, the output I get for this file is:

    >hit tstart tend #query qstart matches qend score probability mismatches
    >gi|122939163|ref|NM_000165.3| 1595 1630 SNPSTER4_104_308EFAAXX:1:1:1694:128 1 35 36 -10 1.000000 1
    mismatch: 30.GA
    >gi|113412254|ref|XR_018775.1| 1578 1613 SNPSTER4_104_308EFAAXX:1:1:1608:94 1 36 36 0 0.090884 0
    mismatch: 3.GT 34.TG

p.s. There's a possible speed improvement: if you XOR (^) the two strings, you get a string with null bytes where they are the same and non-null bytes where they differ:

    my $xor = $str1 ^ $str2;
    while ($xor =~ /[^\0]/g) {
        my $i = pos($xor) - 1;    # or: $-[0]
        push @mismatch, "$i." . substr($str1, $i, 1) . substr($str2, $i, 1);
    }
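
If you want to try the XOR idea on its own, here is a minimal self-contained sketch. The sub name `mismatches` and the two sample strings are made up for illustration; they are not from your data. It also assumes you only want to compare the common length of the two strings, so that substr never runs past the end of the shorter one.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns one "pos.XY" entry for each 0-based
# position where the two strings differ.
sub mismatches {
    my ($s1, $s2) = @_;

    # Assumption: compare only the common length of both strings.
    my $len = length($s1) < length($s2) ? length($s1) : length($s2);

    # String XOR: "\0" where the characters agree, non-null where they differ.
    my $xor = substr($s1, 0, $len) ^ substr($s2, 0, $len);

    my @mismatch;
    while ($xor =~ /[^\0]/g) {
        my $i = $-[0];    # start offset of the current match
        push @mismatch, "$i." . substr($s1, $i, 1) . substr($s2, $i, 1);
    }
    return @mismatch;
}

# Made-up sample strings, differing at positions 2 and 7.
my @m = mismatches("ACGTACGT", "ACCTACGA");
print "mismatch: @m\n" if @m;    # prints "mismatch: 2.GC 7.TA"
```

Note that ^ only does a string XOR when both operands are strings; if either one has been used as a number, Perl may do a numeric XOR instead, so keep the sequence strings away from numeric context.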

In reply to Re^3: match and mismatch by bart
in thread match and mismatch by heidi
