Hi All,

I'm reasonably new to perl and very new to BioPerl, so my apologies if this seems like a trivial question. I'm using Bio::AlignIO and Bio::SimpleAlign to generate pairwise alignments of sequences of interest to a reference sequence - in this case the human mtDNA reference sequence. I'm able to generate the alignment in fasta format without issue, however I wish to record the differences between the reference and the seuqences of interest in a tab delimited file for use with the webtool Haplogrep (http://haplogrep.uibk.ac.at/index.html). I can imagine doing this as loop whereby I go through each base of the reference sequence and say if the position in the reference does not match the query sequence, record the position and the nucleotide present in the query sequence. However, I can't figure out how to do this from the Bio::AlignIO object generated as part of the alignment. Here is the code I have so far:

#!/usr/bin/perl use strict; use warnings; use Bio::AlignIO; use Bio::SimpleAlign; my $ref = "rCRS.fas"; my $comp = shift @ARGV or die "please provide comparison sequence: $!\ +n"; my $comp_id; if ($comp =~ /(\S*)\.fas*/) { $comp_id = $1; } my $CAT = "cat $ref ".$comp." > ".$comp.".tmp"; try_cmd($CAT); my $ALN = "muscle -in ".$comp.".tmp -out ".$comp.".aln"; try_cmd($ALN); my $str = Bio::AlignIO->new( -file => $comp.".aln", -format => "fasta", ); my $aln = $str->next_aln(); ####Subroutine#### sub try_cmd { my $cmd = shift; my $status; $status = system($cmd); if ( $status ) { print STDERR ( "problem running $cmd\nReturned $status n" ); } else { print STDERR ( "Successfully ran $cmd\n" ); } }

So to give a brief example, if I had generated an alignment such as this one:

>Ref_seq AATTTGGGCTACT >Query_seq AAATTCGGCTACA

I would then like to generate an output such as:

3A 6C 12A

Can anyone help me figure out how to do so? General comments on better/easier ways to do this are also greatly appreciated.


In reply to BioPerl: Annotate mismatches in an alignment by r36atd

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.