http://qs1969.pair.com?node_id=288603


in reply to Re: Re: BioInformatics - polyA tail search
in thread BioInformatics - polyA tail search

Are you sure that's what you want? You know your problem-space, so I'm just checking. There are three issues I see:

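  1. FASTA doesn't have to be 80 characters per line; the line width can vary from file to file
  2. FASTA doesn't have to be ctrl-M (carriage return) delimited; line endings depend on where the file came from
  3. Are you sure you don't want to check strand identity (e.g. look for poly-T sequences)? A poly-A tail read off the opposite strand shows up as a poly-T run.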

Either way, the simple answer is to parse the file using SeqIO (Bio::SeqIO from BioPerl), extract each sequence as a plain string, then run a regex over it to look for what you want. In your case I think that's $seq =~ /[AN]{10,}/.
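
Here's a rough sketch of that approach, untested against your data. The sub name find_poly_runs, the output columns, the /i flag (in case your file is lower-case) and the extra [TN]{10,} alternative for the opposite strand are all my own assumptions, not anything from your code. The Bio::SeqIO part only runs if you pass a file name, and it uses just new, next_seq, seq and display_id.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return [start, length, text] for every run of
# 10 or more A/N (poly-A) or T/N (poly-T, i.e. poly-A on the other strand).
# Start positions are 1-based.
sub find_poly_runs {
    my ($sequence) = @_;
    my @hits;
    # /i is an assumption: it lets lower-case sequence match as well.
    while ($sequence =~ /([AN]{10,}|[TN]{10,})/gi) {
        push @hits, [ $-[1] + 1, length($1), $1 ];
    }
    return @hits;
}

# Only touch BioPerl when a FASTA file is given on the command line.
if (@ARGV) {
    require Bio::SeqIO;
    my $in = Bio::SeqIO->new(-file => $ARGV[0], -format => 'fasta');

    # SeqIO joins wrapped lines for us, so line width and line endings
    # in the file are not something the regex has to care about.
    while (my $seq_obj = $in->next_seq) {
        for my $hit (find_poly_runs($seq_obj->seq)) {
            my ($start, $len, $text) = @$hit;
            printf "%s\t%d\t%d\t%s\n",
                $seq_obj->display_id, $start, $len, $text;
        }
    }
}
```

Note that a run of nothing but N's will also match, same as with the original pattern. If you only care about tails at the very end of the sequence, anchoring the pattern with $ would be the obvious tweak, but that depends on your data.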