in reply to Re^5: fasta hash
in thread fasta hash

Thanks for your help, but none of these things work, even if I follow your instructions to the letter. Putting the if block within the while loop prints only the IDs and no sequences. Putting it in the big while loop prints several lines repeatedly.

Re^7: fasta hash
by moritz (Cardinal) on Aug 26, 2011 at 19:40 UTC
Re^7: fasta hash
by Marshall (Canon) on Aug 28, 2011 at 15:32 UTC
    Instead of processing the second file line by line, it is possible to set the input record separator ($/) to the '>' character, so that each read returns one whole FASTA record (header plus all of its sequence lines). This simplifies processing of the file.
    #!/usr/bin/perl -w
    use strict;

    open (IDS, '<', "fastaIds") or   #with your data 2056360012
                                     #               2056360013
        die "cannot open fastaIds $!\n";
    #-------------#
    my %ids;
    while (<IDS>)   #process first file with only ID's
    {
        chomp;
        $ids{$_}=1;
    }
    #-------------#
    $/='>';   #input record separator is now '>'
    while (<DATA>)
    {
        chomp;                      # now works on '>' not \n
        next if /^\s*$/;            # first record will be blank
        my ($id) = /^\s*(\d+)/;
        print ">$_" if $ids{$id};   #print all lines for this id
    }
    #-------------#

    =prints

    > 2056360012 1047627436237
    yyyacgagchagshgashcgahcgac
    acsasasasacsacsasasacaca
    ascassacsaascascascascac
    > 2056360013 1047627436238
    xxxxcgagchagshgashcgahcgac
    acsasasasacsacsasasacaca
    ascassacsaascascascascac

    =cut

    __DATA__
    > 2056360012 1047627436237
    yyyacgagchagshgashcgahcgac
    acsasasasacsacsasasacaca
    ascassacsaascascascascac
    > 2056360013 1047627436238
    xxxxcgagchagshgashcgahcgac
    acsasasasacsacsasasacaca
    ascassacsaascascascascac
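For anyone who wants to try the record-separator idea without setting up files first, here is a minimal sketch of the same approach. It is not the code from the post above: the helper name `filter_fasta`, the in-memory filehandles, and the shortened sample sequences are all made up for illustration. It also assumes, as in the sample data, that the ID is the first number on the header line. `$/` is changed with `local` so the ID list can still be read line by line, and records without a numeric ID are skipped instead of triggering an "uninitialized value" warning.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns the FASTA
# records whose leading numeric ID is a key of the %$wanted hash.
sub filter_fasta {
    my ($fasta_fh, $wanted) = @_;
    local $/ = '>';    # inside this sub, one read = one '>'-delimited record
    my @hits;
    while (my $record = <$fasta_fh>) {
        chomp $record;                       # strips the trailing '>' (the current $/)
        next if $record =~ /^\s*$/;          # the text before the first '>' is empty
        my ($id) = $record =~ /^\s*(\d+)/;   # first number on the header line
        next unless defined $id && $wanted->{$id};
        push @hits, ">$record";              # put back the '>' that chomp removed
    }
    return @hits;
}

# Made-up sample data, read through in-memory filehandles so the
# sketch runs without any files on disk.
my $id_list = "2056360013\n";
my $fasta   = "> 2056360012 1047627436237\nyyyacgag\nacsasa\n"
            . "> 2056360013 1047627436238\nxxxxcgag\nacsasa\n";

# Read the wanted IDs line by line ($/ is still "\n" here).
open my $ids_fh, '<', \$id_list or die "cannot open ID list: $!";
my %wanted;
while (my $line = <$ids_fh>) {
    chomp $line;
    $line =~ s/^\s+|\s+$//g;    # tolerate stray whitespace or a trailing \r
    $wanted{$line} = 1 if length $line;
}
close $ids_fh;

open my $fasta_fh, '<', \$fasta or die "cannot open FASTA data: $!";
my @hits = filter_fasta($fasta_fh, \%wanted);
close $fasta_fh;

print @hits;
```

With this sample input only the record for 2056360013 is printed, header and sequence lines together. To run it on real data, the two in-memory `open` calls would be replaced by opens on the actual ID and FASTA files.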