Hello everyone, I have a script where I read entries from a file, one at a time. I want to count the number of alphabets in each entry, length of each entry, etc. (code attached below). My script works fine for everything else, but it does not return me the length of the entry in the '$len' variable. So, for example, if the the length if the first entry has 235 letters, I want it to return the output as "Sequence length = 235", and so on. The script runs without any error but doesn't generate an output for the length. Can anybody tell me what I'm doing wrong?

#!/usr/bin/perl use strict; use warnings; print 'Please enter protein sequence filename: '; chomp( my $prot_filename = <STDIN> ); open my $PROTFILE, '<', $prot_filename or die "Cannot open '$prot_filename' because: $!"; my $report_name = $prot_filename.'_report'; open my $out_file, '>', $report_name or die "Cannot open '$report_name' because: $!"; $/ = ''; # Set paragraph mode my @count=(); my %absent=(); my $name; my $len; while ( my $para = <$PROTFILE> ) { # Remove fasta header line if ( $para =~ s/^>(.*)//m ){ $name = $1; }; # Remove comment line(s) $para =~ s/^\s*#.*//mg; #Remove trailing spaces between text #$space =~ s/\s+$//; my %prot; $para =~ s/([ACDEFGHIKLMNPQRSTVWY])/ ++$prot{ $1 } /eg; $len = length($para); my $num = scalar keys %prot; push @count,[$num,$name]; printf "Counted %d for %s ..\n",$num,substr($name,0,50); print $out_file "$name\n"; print $out_file join( ' ', map "$_=$prot{$_}", sort keys %prot ), +"\n"; printf $out_file "Amino acid alphabet = %d\n\n",$num ; print $out_file "Sequence length = ", $len; # count absent for ('A'..'Z'){ ++$absent{$_} unless exists $prot{$_}; }; }; # sort names by count in ascending order to get lowest my @sorted = sort { $a->[0] <=> $b->[0] } @count; my $lowest = $sorted[0]->[0]; # maybe more than 1 lowest printf $out_file "Least number of amino acids is %d in these entries\n +",$lowest; my @lowest = grep { $_->[0] == $lowest } @sorted; print $out_file "$_->[1]\n" for @lowest; # show all results print $out_file "\nAll results in ascending count\n"; for (@sorted){ printf $out_file "%d %s\n",@$_; }; close $out_file; print "Results are printed in $report_name\n"; # print absent counts print "\nExclusion of various amino acids in $prot_filename is as foll +ows\n"; for (sort keys %absent){ printf "%s=%d\n",$_,$absent{$_}; };

In reply to Count the sequence length of each entry in the file by davi54

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.