Hello everyone, I have a script that reads entries from a file, one at a time. For each entry I want to count the number of distinct letters, the length of the entry, etc. (code below). The script works fine for everything else, but it does not give me the length of the entry in the '$len' variable. For example, if the first entry has 235 letters, I want the output to be "Sequence length = 235", and so on for each entry. The script runs without any errors but doesn't generate output for the length. Can anybody tell me what I'm doing wrong?
#!/usr/bin/perl
use strict;
use warnings;

print 'Please enter protein sequence filename: ';
chomp( my $prot_filename = <STDIN> );
open my $PROTFILE, '<', $prot_filename
  or die "Cannot open '$prot_filename' because: $!";

my $report_name = $prot_filename.'_report';
open my $out_file, '>', $report_name
  or die "Cannot open '$report_name' because: $!";

$/ = ''; # Set paragraph mode
my @count=();
my %absent=();
my $name;
my $len;

while ( my $para = <$PROTFILE> ) {

  # Remove fasta header line
  if ( $para =~ s/^>(.*)//m ){
    $name = $1;
  };

  # Remove comment line(s)
  $para =~ s/^\s*#.*//mg;

  #Remove trailing spaces between text
  #$space =~ s/\s+$//;

  my %prot;
  $para =~ s/([ACDEFGHIKLMNPQRSTVWY])/ ++$prot{ $1 } /eg;
  $len = length($para);

  my $num = scalar keys %prot;
  push @count,[$num,$name];
  printf "Counted %d for %s ..\n",$num,substr($name,0,50);
  print $out_file "$name\n";
  print $out_file join( ' ', map "$_=$prot{$_}", sort keys %prot ), "\n";
  printf $out_file "Amino acid alphabet = %d\n\n",$num ;
  print $out_file "Sequence length = ", $len;

  # count absent
  for ('A'..'Z'){
    ++$absent{$_} unless exists $prot{$_};
  };
};

# sort names by count in ascending order to get lowest
my @sorted = sort { $a->[0] <=> $b->[0] } @count;
my $lowest = $sorted[0]->[0];

# maybe more than 1 lowest
printf $out_file "Least number of amino acids is %d in these entries\n",$lowest;
my @lowest = grep { $_->[0] == $lowest } @sorted;
print $out_file "$_->[1]\n" for @lowest;

# show all results
print $out_file "\nAll results in ascending count\n";
for (@sorted){
  printf $out_file "%d %s\n",@$_;
};
close $out_file;
print "Results are printed in $report_name\n";

# print absent counts
print "\nExclusion of various amino acids in $prot_filename is as follows\n";
for (sort keys %absent){
  printf "%s=%d\n",$_,$absent{$_};
};
In reply to Count the sequence length of each entry in the file by davi54