There are couple of ways to do it. If you are interested in using the Clustalw (EBI) then you have to install Bioperl module for your perl to give the multiple sequences as a set and get the output through it. The module within Bioperl package for using it is here: Clustalw. Have a look at the synopsis as it very clearly defined
The other way is by writing a Needleman Wunsch Program as a function in your perl program, providing the sequences one by one and making it compare with the ref seq sequence. Your Needleman Wunsch function is as follows:
# Needleman-Wunsch Algorithm sub NeedlemanWunsch { my ($seq1, $seq2) = @ARGV; # scoring scheme my $MATCH = 1; # +1 for letters that match my $MISMATCH = -1; # -1 for letters that mismatch my $GAP = -1; # -1 for any gap # initialization my @matrix; $matrix[0][0]{score} = 0; $matrix[0][0]{pointer} = "none"; for(my $j = 1; $j <= length($seq1); $j++) { $matrix[0][$j]{score} = $GAP * $j; $matrix[0][$j]{pointer} = "left"; } for (my $i = 1; $i <= length($seq2); $i++) { $matrix[$i][0]{score} = $GAP * $i; $matrix[$i][0]{pointer} = "up"; } # fill for(my $i = 1; $i <= length($seq2); $i++) { for(my $j = 1; $j <= length($seq1); $j++) { my ($diagonal_score, $left_score, $up_score); # calculate match score my $letter1 = substr($seq1, $j-1, 1); my $letter2 = substr($seq2, $i-1, 1); + if ($letter1 eq $letter2) { $diagonal_score = $matrix[$i-1][$j-1]{score} + $MATCH; } else { $diagonal_score = $matrix[$i-1][$j-1]{score} + $MISMATCH; } # calculate gap scores $up_score = $matrix[$i-1][$j]{score} + $GAP; $left_score = $matrix[$i][$j-1]{score} + $GAP; # choose best score if ($diagonal_score >= $up_score) { if ($diagonal_score >= $left_score) { $matrix[$i][$j]{score} = $diagonal_score; $matrix[$i][$j]{pointer} = "diagonal"; } else { $matrix[$i][$j]{score} = $left_score; $matrix[$i][$j]{pointer} = "left"; } } else { if ($up_score >= $left_score) { $matrix[$i][$j]{score} = $up_score; $matrix[$i][$j]{pointer} = "up"; } else { $matrix[$i][$j]{score} = $left_score; $matrix[$i][$j]{pointer} = "left"; } } } } # trace-back my $align1 = ""; my $align2 = ""; # start at last cell of matrix my $j = length($seq1); my $i = length($seq2); while (1) { last if $matrix[$i][$j]{pointer} eq "none"; # ends at first cell o +f matrix if ($matrix[$i][$j]{pointer} eq "diagonal") { $align1 .= substr($seq1, $j-1, 1); $align2 .= substr($seq2, $i-1, 1); $i--; $j--; } elsif ($matrix[$i][$j]{pointer} eq "left") { $align1 .= substr($seq1, $j-1, 1); $align2 .= "-"; $j--; } elsif ($matrix[$i][$j]{pointer} eq "up") { $align1 .= "-"; $align2 .= substr($seq2, $i-1, 1); $i--; } } $align1 = reverse $align1; $align2 = reverse $align2; print "$align1\n"; print "$align2\n"; }
I would prefer that you should use the Clustalw rather than using a needleman wunsch program as it would answer your questions related to the matrices that you want to use. I hope it works for you. All the best
In reply to Re: how to construct multiple sequence alignment from group of pairwise alignments
by snape
in thread how to construct multiple sequence alignment from group of pairwise alignments
by gudluck
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |