Re: Count the sequence length of each entry in the file

    my %prot;
    $para =~ s/([ACDEFGHIKLMNPQRSTVWY])/ ++$prot{ $1 } /eg;
    $len = length($para);

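You are taking the length after you modify the variable! With the `/e` flag, the substitution replaces each matched letter with its running count, so by the time `length` runs, `$para` no longer holds the original sequence.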
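A minimal sketch of the effect, using a made-up sequence of twelve `A` residues (the sample data and the names `$seq`, `$modified`, and `%count` are illustrative, not from the original script). Once a count reaches two digits the modified string grows, so its length no longer matches the sequence. Taking the length first, or counting with a match loop that leaves the string alone, avoids the problem.

```perl
use strict;
use warnings;

# Hypothetical sample: twelve identical residues (not real data).
my $seq = "A" x 12;

# Take the length BEFORE anything modifies the string.
my $len_before = length($seq);

# The substitution from the post, applied to a copy: each letter is
# replaced by its running count (1, 2, ..., 12).
my %prot;
( my $modified = $seq ) =~ s/([ACDEFGHIKLMNPQRSTVWY])/ ++$prot{ $1 } /eg;
my $len_after = length($modified);

# Alternative: count residues without modifying the string at all.
my %count;
$count{$1}++ while $seq =~ /([ACDEFGHIKLMNPQRSTVWY])/g;

print "length before substitution: $len_before\n";
print "length after substitution:  $len_after\n";
print "$_: $count{$_}\n" for sort keys %count;
```

Either ordering fix works: call `length` before the `s///eg`, or keep the original string intact and count with the `while ... =~ //g` loop.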


Re^2: Count the sequence length of each entry in the file by davi54 (Sexton) on Oct 01, 2020 at 20:27 UTC
Ohh, I see. I instead placed it like this: `$para =~ s/^\s*#.*//mg; $len = length($para);` and got the output. However, the count is incorrect. For example, in the first entry, after removing the header, the sequence length (the uppercase letters) is 111, whereas the script reports 115. I don't know what I'm doing wrong. Why is the script returning the wrong value?
Re^3: Count the sequence length of each entry in the file by wazat (Monk) on Oct 02, 2020 at 23:44 UTC
When you strip off the header, the line endings remain. Try deleting the "\n" characters. If on Windows, also delete the "\r" characters.

    # Remove fasta header line
    if ( $para =~ s/^>(.*)//m ) {
        $name = $1;
    }
    # Remove comment line(s)
    $para =~ s/^\s*#.*//mg;
    # Remove line endings
    $para =~ tr/\r\n//d;
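To make the off-by-a-few count concrete, here is a small worked sketch on a made-up two-line record. The record contents and variable names are hypothetical, and the patterns are assumed to be `s/^>(.*)//m` and `s/^\s*#.*//mg`. Each leftover newline adds one to `length`, which is the kind of discrepancy described above.

```perl
use strict;
use warnings;

# Hypothetical FASTA-style record: header, one comment line, two sequence lines.
my $para = ">sp|TEST|example header\n# a comment line\nMKTAYIAK\nQRQISFVK\n";

# Remove the header line and keep its text as the entry name.
my $name = '';
if ( $para =~ s/^>(.*)//m ) {
    $name = $1;
}

# Remove comment line(s).
$para =~ s/^\s*#.*//mg;

# The newlines are still in the string at this point.
my $len_with_newlines = length($para);

# Delete every CR and LF, then measure again.
$para =~ tr/\r\n//d;
my $len = length($para);

print "name: $name\n";
print "length with line endings:    $len_with_newlines\n";
print "length without line endings: $len\n";
```

In this sample the two sequence lines hold 16 residues, and the three remaining newlines account for the difference between the two lengths.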
Re^4: Count the sequence length of each entry in the file by haukex (Archbishop) on Oct 03, 2020 at 07:44 UTC
> If on windows also delete "\r" characters.

This is not necessary, as the PerlIO `:crlf` layer is the default on Windows and converts CRLF to LF on input. One can disable the translation with `binmode` or the `:raw` pseudolayer, but that's not the case in any of the code shown here. See also Newlines in perlport, and note that `chomp` also handles paragraph mode correctly.
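A short sketch of the `chomp` point, assuming the records are read in paragraph mode (`$/ = ""`) and using an in-memory filehandle with made-up data. In paragraph mode `chomp` removes all trailing newlines from a record, but line breaks inside a multi-line record remain, so they still need to be deleted before taking the length.

```perl
use strict;
use warnings;

# Hypothetical input: two records separated by blank lines.
my $data = "AAA\nBBB\n\n\nCCC\n";
open my $fh, '<', \$data or die "Cannot open in-memory handle: $!";

my @records;
{
    local $/ = "";    # paragraph mode: records are separated by blank lines
    while ( my $para = <$fh> ) {
        chomp $para;          # strips all trailing newlines of the record
        $para =~ tr/\n//d;    # strips the line breaks inside the record
        push @records, $para;
    }
}
close $fh;

printf "%s (%d)\n", $_, length($_) for @records;
```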
Re^5: Count the sequence length of each entry in the file by wazat (Monk) on Oct 03, 2020 at 20:25 UTC