#!/usr/bin/env perl
use 5.010;

for (<>) {
    if (/^>/) {
        # Header
    } elsif (/^[A-Z]+$/) {
        # Protein
        my $a = tr/A/A/;
        say "A: $a, length: " . length;
    }
}
There are two issues I am facing right now. First, some of the sequence entries in the input file are long and continue onto the following lines (see the example below). But this script reads only the first line of an entry before moving on to the next entry, so I'm getting wrong values for the length and the number of 'A's. Is there a way to fix this?
Second, this script prints its output to the terminal. I want it to write the output to a file. How and where do I declare the output file details?

Example sequence:

>sp|P76347|YEEJ_ECOLI Uncharacterized protein YeeJ OS=Escherichia coli (strain K12) OX=83333 GN=yeeJ PE=3 SV=3
MATKKRSGEEINDRQILCGMGIKLRRLTAGICLITQLAFPMAAAAQGVVNAATQQPVPAQ
IAIANANTVPYTLGALESAQSVAERFGISVAELRKLNQFRTFARGFDNVRQGDELDVPAQ
VSEKKLTPPPGNSSDNLEQQIASTSQQIGSLLAEDMNSEQAANMARGWASSQASGAMTDW
LSRFGTARITLGVDEDFSLKNSQFDFLHPWYETPDNLFFSQHTLHRTDERTQINNGLGWR
HFTPTWMSGINFFFDHDLSRYHSRAGIGAEYWRDYLKLSSNGYLRLTNWRSAPELDNDYE
ARPANGWDVRAESWLPAWPHLGGKLVYEQYYGDEVA
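
For illustration, both points (sequences wrapped over several lines, and output going to a file) could be handled roughly as follows. This is only a sketch, not a tested solution for the real data: the sub name `summarize_fasta` and the file names in the usage comment are made up for this example, and it assumes every entry starts with a `>` header line and that sequence lines contain only uppercase letters.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use 5.010;

# Read FASTA records from the filehandle $in, joining wrapped sequence
# lines into one string per record, and print one summary line per
# record to the filehandle $out. (Hypothetical helper for this sketch.)
sub summarize_fasta {
    my ($in, $out) = @_;
    my ($header, $seq) = (undef, '');

    # Print the summary for the record collected so far, if there is one.
    my $flush = sub {
        return unless defined $header;
        my $count = ($seq =~ tr/A//);   # tr with an empty replacement only counts
        say {$out} "A: $count, length: ", length $seq;
    };

    while (my $line = <$in>) {
        chomp $line;
        $line =~ s/\r$//;               # tolerate Windows line endings
        if ($line =~ /^>/) {
            $flush->();                 # previous record is complete
            ($header, $seq) = ($line, '');
        }
        elsif ($line =~ /^[A-Z]+$/) {
            $seq .= $line;              # continuation of the current sequence
        }
    }
    $flush->();                         # do not forget the last record
}

# Hypothetical usage: perl count_a.pl input.fasta output.txt
if (@ARGV == 2) {
    my ($in_name, $out_name) = @ARGV;
    open my $in,  '<', $in_name  or die "Cannot read $in_name: $!";
    open my $out, '>', $out_name or die "Cannot write $out_name: $!";
    summarize_fasta($in, $out);
    close $out or die "Cannot close $out_name: $!";
}
```

The idea is to accumulate sequence lines and only report once the next header (or the end of the input) is reached. If only the output destination matters, the script can also be left to print to the terminal and be run with shell redirection, e.g. `perl script.pl input.fasta > output.txt`.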
In reply to Re^4: How to count the length of a sequence of alphabets and number of occurence of a particular alphabet in the sequence?
by davi54
in thread How to count the length of a sequence of alphabets and number of occurence of a particular alphabet in the sequence?
by davi54