I think it would be prudent for you to refrain from writing mismatch tools etc. unless you REALLY know what you are about. Simply interjecting a * between mismatched nucleotides only works if the sequences are equal length and have no more correct alignment, and has only limited usefulness besides. Are you prepared to write a fully-fledged, correct alignment tool just so you can find mismatches? If you are, you are either a glutton for punishment or absolutely silly, or both, and hopefully brilliant regardless. ;) To align two sequences just use pairwise BLAST or the EMBOSS Smith-Waterman tool. Whatever you do, I recommend that you check out bioperl (http://www.bioperl.org or find it on CPAN in the Miscellaneous->Bio namespace) v1.02 first. You can save loads of development time and be using stronger code that has been tested and improved by lots of people. There is a lot of useful functionality. Granted, some of it is the bleeding edge of development so there are a lot of bugs and it is improved constantly. But the basics of bioinformatics, like converting between formats (EMBL to FASTA for example) and BLASTing (and more importantly, parsing BLAST output), and calling/parsing EMBOSS tools have all been around for a while and so are (dare I say it) trustworthy in bioperl. The code is also readable and easy to program. To convert from raw to FASTA, for example:
use Bio::SeqIO;
my $instream = Bio::SeqIO->new( -file => 'your_in_file',
-format => 'raw' );
my $outstream = Bio::SeqIO->new( -file => '>>your_out_file',
-format => 'Fasta' )';
while ( my $sequence = $instream->next_seq() ) {
$outstream->write_seq( $sequence );
}
Take it from someone who has been down this road: it's hard to worm your way into understanding how to use bioperl's many interlocking modules but once you define the (I would bet) small subset that you need, which is the hard part, any time spent is time saved in the long run. Don't reinvent the wheel.
Good luck. :)