I assume those numbers are unique identifiers, NOT ranges? Bio::SeqIO from BioPerl will be helpful if you ever need to handle sequences in different file formats.
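use Bio::SeqIO;

my $in = Bio::SeqIO->new(-file => 'filename', -format => 'fasta');
my %seqs;
my @idlist = qw(456-3210 4670-5490);

# read every sequence and store it under its ID, minus any gb| prefix
while ( my $seq = $in->next_seq ) {
    my $id = $seq->display_id;
    $id =~ s/gb\|//;
    $seqs{$id} = $seq;
}

# pull out the wanted sequences in @idlist order, skipping IDs not found
my @seqlist = grep { defined } map { $seqs{$_} } @idlist;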
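The last line of that snippet does two things at once: the map pulls sequences out in the order of @idlist, and the grep drops any ID that was not in the file. Here is a minimal sketch of just that step, with plain strings standing in for the Bio::Seq objects. The third ID and all of the sequence strings are made up for illustration.

```perl
use strict;
use warnings;

# Toy stand-in for the %seqs hash built from the FASTA file;
# plain strings replace the Bio::Seq objects (illustrative data only).
my %seqs = (
    '456-3210'  => 'ACGTACGT',
    '4670-5490' => 'TTGACCAA',
    '9999-1'    => 'GGGGCCCC',
);

# The last ID is deliberately absent from %seqs.
my @idlist = qw(4670-5490 456-3210 no-such-id);

# map looks up each ID in @idlist order; a missing key yields undef,
# which grep { defined } then filters out.
my @seqlist = grep { defined } map { $seqs{$_} } @idlist;

print "$_\n" for @seqlist;
```

This prints the two sequences that were found, in the order they were asked for. If you need to know which IDs were missing, test exists $seqs{$_} on @idlist before the lookup.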
If there are a LOT of sequences and you will be doing this repeatedly for different ID sets, you can do this more efficiently with Bio::Index::Fasta, which builds an on-disk index once so that later lookups do not have to re-read the whole file.
use Bio::Index::Fasta;

my $seqfile = 'filename';    # path to your FASTA file

my $idx = Bio::Index::Fasta->new(-filename => 'seqs.idx', -write_flag => 1);
$idx->id_parser(\&myidparser);
$idx->make_index($seqfile);

my @idlist = qw(456-3210 4670-5490);
my @seqlist;
for my $id ( @idlist ) {
    if ( my $seq = $idx->get_Seq_by_acc($id) ) {
        push @seqlist, $seq;
    }
}

# define your own ID parser if you want to strip out
# the gb| part of the id
# alternatively don't do this and make sure the IDs
# you input exactly match the IDs of the sequences
sub myidparser {
    if ( $_[0] =~ /^>\s*gb\|(\S+)/ ) {
        return $1;
    }
    elsif ( $_[0] =~ /^>\s*(\S+)/ ) {
        return $1;
    }
    else {
        return;
    }
}
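It is worth checking the ID parser on a few header lines before building a big index, since a parser that returns the wrong key means every lookup misses. A small standalone sketch that needs no BioPerl at all follows. The header lines are hypothetical, and it assumes the parser is handed the raw header line including the leading >.

```perl
use strict;
use warnings;

# Same logic as the myidparser sub in the post: prefer the part after
# "gb|", otherwise fall back to the first whitespace-delimited token.
sub myidparser {
    my ($header) = @_;
    if ( $header =~ /^>\s*gb\|(\S+)/ ) {
        return $1;
    }
    elsif ( $header =~ /^>\s*(\S+)/ ) {
        return $1;
    }
    return;
}

# Hypothetical header lines, made up for illustration.
my @headers = (
    '>gb|456-3210 some description',
    '>4670-5490 another description',
    'not a header line',
);

for my $h (@headers) {
    my $id = myidparser($h);
    printf "%-35s => %s\n", $h, defined $id ? $id : '(no id)';
}
```

If the IDs printed here match the strings you plan to put in @idlist, the index lookups should find them.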

In reply to Re: Iterating through files and arrays simultaneously by stajich
in thread Iterating through files and arrays simultaneously by Anonymous Monk
