Hi all, I have an alignment file and I want to analyze it and output all of the positions that are not conserved. I was thinking of reading the file in as an array of arrays. Below is an example input file, followed by my current code, which produces the wrong output and is probably not the best way to go about this. Any help is appreciated.
5 15
1 ATCG--ATCG-ATCG
2 ATGC--ATCG-ATCG
3 ATGC-A-TCG-ATCG
4 ATGC--ATCG-ATCG
5 ATCG--ATCG-AACG
use strict;
use warnings;

my $align_file=$ARGV[0]; #.py (phylip file right now)
open (F, $align_file) || die "Could not open $align_file: $!";

my @align_matrix;
my %HoTaxids;
my @columns;
while (my $line = <F>) {
    chomp($line);
    <F>; # skip first line of phylip
    next if $line =~ /^\s*$/; # skip blank lines
    my ($taxid,$align_line)=split(/\s/,$line);
    @columns = split(undef, $align_line); #splitting on undef splits each character
    push (@align_matrix, \@columns);
    $HoTaxids{$taxid}=1;
}
close(F);

my $num_rows=scalar(keys %HoTaxids);
my $align_len=scalar(@columns);
for (my $i=0;$i<$num_rows;$i++) {
    for (my $j=0;$j<$align_len;$j++){
        if ($align_matrix[$i][$j] ne $align_matrix[$i][$j+1]) {
            print "position\t$j\t is not conserved\n";
        }
    }
}
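A few things in the code above would explain the wrong output. The bare `<F>;` inside the loop reads and discards a line on every iteration, not only the header. `@columns` is declared outside the loop, so every `\@columns` pushed onto the matrix refers to the same array. And the comparison checks neighbouring columns within one row (`[$i][$j]` against `[$i][$j+1]`) rather than the rows within one column. What follows is a minimal sketch of one possible column-wise approach, not a definitive fix. It assumes one taxon per line, whitespace between ID and sequence, and equal-length sequences. The helper name `non_conserved_positions` is hypothetical, and the example data is read from an in-memory string instead of `$ARGV[0]` so the sketch is self-contained.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): takes a reference to a
# list of aligned sequences of equal length and returns the 0-based column
# indices at which the rows do not all share the same character.
sub non_conserved_positions {
    my ($seqs) = @_;
    return unless @$seqs;

    # One fresh anonymous array per row, so rows do not share storage.
    my @matrix = map { [ split //, $_ ] } @$seqs;
    my $len    = scalar @{ $matrix[0] };

    my @variable;
    for my $j (0 .. $len - 1) {
        my %seen;
        $seen{ $_->[$j] } = 1 for @matrix;      # characters seen in column $j
        push @variable, $j if scalar(keys %seen) > 1;
    }
    return @variable;
}

# Example input from the post, held in a string for illustration.
my $phylip = <<'END';
5 15
1 ATCG--ATCG-ATCG
2 ATGC--ATCG-ATCG
3 ATGC-A-TCG-ATCG
4 ATGC--ATCG-ATCG
5 ATCG--ATCG-AACG
END

open my $fh, '<', \$phylip or die "Could not open in-memory data: $!";
my $header = <$fh>;                  # skip the header once, before the loop
my @seqs;
while (my $line = <$fh>) {
    chomp $line;
    next if $line =~ /^\s*$/;        # skip blank lines
    my ($taxid, $seq) = split ' ', $line;
    push @seqs, $seq;
}
close $fh;

print "position\t$_\tis not conserved\n" for non_conserved_positions(\@seqs);
```

For the example data this reports positions 2, 3, 5, 6 and 12 (counting from 0). Gap characters are treated like any other character here, so a column mixing `-` and a base counts as not conserved; whether that is the desired behaviour depends on the analysis.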
In reply to manipulating alignment data by AWallBuilder