in reply to Extracting BLAST hits from a list of sequences
Given your datasets, perhaps the following will be helpful (it correctly returns the one BLAST entry which contains a matching fasta header):
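use strict;
use warnings;

my %headers;

# First file (headerFile): remember every fasta header
while (<>) {
    $headers{$1} = 1 if /^>(.+)/;
    last if eof;    # stop at the end of the first file
}

# Second file (blastFile): read one 'Query= ' record at a time
local $/ = 'Query= ';

while (<>) {
    chomp;          # strip the trailing 'Query= '
    print $/ . $_ if /(.+)/ and defined $headers{$1};
}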
Usage: perl script.pl headerFile blastFile [>outFile]
The last, optional parameter redirects output to a file; without it, the matching entries are printed to the screen (STDOUT).
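The first loop stores each fasta header (the text after >) as a hash key, and last if eof ends it at the end of headerFile so the second loop starts with blastFile. The local $/ = 'Query= '; sets file reading to one BLAST entry ('record') at a time. chomp strips the trailing Query= from each record, the header info on the record's first line (the text that followed Query=) is captured, and the complete entry is printed, with Query= put back in front, if that header is in the hash.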
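If you'd like to see the record-at-a-time reading in isolation, here is a minimal sketch that runs against an in-memory string rather than your files. The sample report text, the header 'seq2 sample', and the sub name matching_records are all made up for illustration, and I've used exists instead of defined, which behaves the same for this hash.

```perl
use strict;
use warnings;

# Hypothetical miniature BLAST report; real reports are far longer.
my $report = <<'END';
BLASTN 2.2.28+

Query= seq1 sample
Length=100
hit A

Query= seq2 sample
Length=200
hit B
END

# Hypothetical header list, as the first loop of the script would build it.
my %headers = ( 'seq2 sample' => 1 );

# Return every 'Query= ' record whose first line is a key in %$wanted.
sub matching_records {
    my ( $text, $wanted ) = @_;

    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    local $/ = 'Query= ';    # one BLAST entry per read

    my @out;
    while ( my $record = <$fh> ) {
        chomp $record;       # removes a trailing 'Query= ', if present
        next unless $record =~ /(.+)/;    # $1 = first non-empty line
        push @out, $/ . $record if exists $wanted->{$1};
    }
    close $fh;
    return @out;
}

my @hits = matching_records( $report, \%headers );
print @hits;
```

The first chunk read is whatever precedes the first Query= (here the BLASTN banner); its first line isn't in the hash, so it is skipped without any special handling.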
Re^2: Extracting BLAST hits from a list of sequences
by no_slogan (Deacon) on Jan 20, 2014 at 16:44 UTC
by Kenosis (Priest) on Jan 20, 2014 at 18:32 UTC