Ohh, I see. I instead placed it like this:
$para =~ s/^\s*#.*//mg;
$len = length($para);
and got the output. However, the count is incorrect. For example, in the first entry, after removing the header, the sequence length (the uppercase letters) is 111, whereas the script reports 115. I don't know what I'm doing wrong. Why is the script returning the wrong value?
When you strip off the header, the line endings remain and are counted by length. Try deleting the "\n" characters. If you are on Windows, also delete the "\r" characters.
# Remove fasta header line
if ( $para =~ s/^>(.*)//m ) {
    $name = $1;
}
# Remove comment line(s)
$para =~ s/^\s*#.*//mg;
$para =~ tr/\r\n//d;
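To see the effect on a small case, here is a minimal sketch. The record contents and the helper name sequence_length are made up for illustration and are not from your data. It returns the length before and after the line endings are deleted.

```perl
use strict;
use warnings;

# Hypothetical helper: takes one FASTA record (header plus sequence lines)
# and returns its length before and after the line endings are removed.
sub sequence_length {
    my ($para) = @_;
    $para =~ s/^>.*//m;          # remove the header text; its "\n" stays behind
    $para =~ s/^\s*#.*//mg;      # remove comment line(s)
    my $raw_len = length $para;  # still counts every "\n" (and "\r")
    $para =~ tr/\r\n//d;         # delete the line-ending characters
    return ( $raw_len, length $para );
}

# Made-up record: 14 letters spread over three sequence lines
my $record = ">seq1 made-up example\nACGTAC\nGTACGT\nAC\n";
my ( $with_newlines, $clean ) = sequence_length($record);
print "with line endings: $with_newlines, sequence only: $clean\n";
# prints "with line endings: 18, sequence only: 14"
```

The difference of 4 here is the newline left by the header plus one per sequence line, so the size of the overcount depends on how many lines the record spans.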
If on windows also delete "\r" characters.
This is not necessary, as the PerlIO :crlf layer is the default on Windows and converts CRLF to LF on input. One can disable the translation with binmode or the :raw pseudolayer, but that's not the case in any of the code shown here. See also Newlines in perlport, and note that chomp also handles paragraph mode correctly.
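As a small illustration of the chomp point, here is a sketch with a made-up record and a hypothetical helper name, chomp_paragraph. With $/ set to the empty string, chomp strips all trailing newlines. Newlines between sequence lines are left alone, so a multi-line sequence still needs them removed before counting.

```perl
use strict;
use warnings;

# Hypothetical helper: chomp a record the way it would be chomped
# after being read in paragraph mode.
sub chomp_paragraph {
    my ($para) = @_;
    local $/ = "";    # paragraph mode: records end at blank lines
    chomp $para;      # removes ALL trailing newlines in this mode
    return $para;
}

# Made-up record: two sequence lines followed by blank lines
my $para = chomp_paragraph("ACGT\nTTGA\n\n\n");
print length($para), "\n";    # prints 9: eight letters plus one interior "\n"
```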