in reply to Re: alphabet counting
in thread alphabet counting

Thank you. I have a 428 KB FASTA file (plain text) containing 1000 proteins. The result should therefore have 1000 lines, each giving the frequency of every amino acid (letter) in one protein, but my result file is 14 MB, which Notepad cannot handle.
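For reference, here is a minimal sketch of what "one line of counts per protein" could look like. The poster's own code is not shown in this thread, so this is not a fix for it; it is written in Python for illustration, and the names `read_fasta`, `count_line`, and the toy records are hypothetical.

```python
from collections import Counter

def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines of one record are joined into a single string.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

def count_line(header, sequence):
    """Build one output line: record id, a tab, then letter=count pairs."""
    counts = Counter(sequence)
    ident = header.split()[0]
    fields = [f"{aa}={counts[aa]}" for aa in sorted(counts)]
    return ident + "\t" + " ".join(fields)

# Toy input (made up for this sketch): two records, sequences split over lines.
sample = """>p1 first toy protein
MKKA
AM
>p2 second toy protein
GGC
""".splitlines()

for header, seq in read_fasta(sample):
    print(count_line(header, seq))
```

With this layout the output has exactly one line per `>` header, so 1000 proteins should give 1000 short lines. An output that is far larger than the input suggests that counts are being printed per sequence line or per character rather than per record.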

Re^3: alphabet counting
by Corion (Patriarch) on Jun 02, 2012 at 13:29 UTC

    Instead of always testing with the whole set, I recommend testing with only two or three proteins. That way, your tests should run quicker and you should be able to inspect the results with notepad (for example). You can also print the results to the console instead of a file to see them.
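The suggestion above can be sketched as follows: keep only the first few records and print them to the console. This is a Python illustration under assumptions; `head_records` and the toy data are hypothetical names, not part of anyone's code in this thread.

```python
import sys

def head_records(lines, n):
    """Yield the lines that belong to the first n FASTA records."""
    seen = 0
    for line in lines:
        if line.startswith(">"):
            seen += 1
            if seen > n:
                # The (n+1)-th header marks the end of the subset.
                break
        if seen:
            yield line

# Toy input (made up): three records, the second one spans two lines.
toy = [">a\n", "MK\n", ">b\n", "GG\n", "C\n", ">c\n", "AA\n"]

subset = list(head_records(toy, 2))
sys.stdout.writelines(subset)  # print to the console instead of a file
```

For a real file, the same function can be fed an open file handle in place of `toy`, and the subset can be written to a small test file that opens comfortably in Notepad.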

      Thank you for your help. I did that, but now I see that the counting is wrong; the code has a bug. Here are the two proteins I tested with:

      >tr|F5HB16|F5HB16_HUMAN Alcohol dehydrogenase 1B OS=Homo sapiens GN=ADH1B PE=2 SV=1
      MSTAGKVIKCKAAVLWEVKKPFSIEDVEVAPPKAYEVRIKMVAVGICHTDDHVVSGNLVT
      PLPVILGHEAAGIVESVGEGVTTVKPGDKVIPLFTPQCGKCRVCKNPESNYCLKNDLGNP
      RGKPIHHFLGTSTFSQYTVVDENAVAKIDAASPLEKVCLIGCGFSTGYGSAVNVAKVTPG
      STCAVFGLGGVGLSAVMGCKAAGAARIIAVDINKDKFAKAKELGATECINPQDYKKPIQE
      VLKEMTDGGVDFSFEVIGRLDTMMASLLCCHEACGTSVIVGVPPASQNLSINPMLLLTGR
      TWKGAVYGGFKSKEGIPKLVADFMAKKFSLDALITHVLPFEKINEGFDLLHSGKSIRTVL
      TF
      >sp|P00325|ADH1B_HUMAN Alcohol dehydrogenase 1B OS=Homo sapiens GN=ADH1B PE=1 SV=2
      MSTAGKVIKCKAAVLWEVKKPFSIEDVEVAPPKAYEVRIKMVAVGICRTDDHVVSGNLVT
      PLPVILGHEAAGIVESVGEGVTTVKPGDKVIPLFTPQCGKCRVCKNPESNYCLKNDLGNP
      RGTLQDGTRRFTCRGKPIHHFLGTSTFSQYTVVDENAVAKIDAASPLEKVCLIGCGFSTG
      YGSAVNVAKVTPGSTCAVFGLGGVGLSAVMGCKAAGAARIIAVDINKDKFAKAKELGATE
      CINPQDYKKPIQEVLKEMTDGGVDFSFEVIGRLDTMMASLLCCHEACGTSVIVGVPPASQ
      NLSINPMLLLTGRTWKGAVYGGFKSKEGIPKLVADFMAKKFSLDALITHVLPFEKINEGF
      DLLHSGKSIRTVLTF