sarf13 has asked for the wisdom of the Perl Monks concerning the following question:

Hi. I am trying the following script to extract sequences in FASTA format from GenBank, given a list of gene IDs. I get an error message when I try to download a large number of sequences (some 1000-2000) at a time; the script does not work when that many sequences are requested in one run.

use Bio::Perl;
$database="genbank";
$format="fasta";
$pipe ="\\|";
$space = " ";
open(INPUTFILE, "<test.txt");
while(<INPUTFILE>)
{
    my($line) = $_;
    chomp($line);
    $line=~ s/$space/:/;
    $line=~ s/$pipe/$space/;
    $line=~ s/g/G/;
    $line=~ s/i/I/;
    $id= "$line";
    #print "$id";
    #print "\n";
    $sequence = get_sequence($database, $id);
    $test = write_sequence( ">>sequences_1.txt", $format, $sequence);
    open (CHK , ">>checking.txt");
    print CHK <<HERE;
$test
HERE
    close CHK;
}
exit;

Thanks

Re: more then 2000 file download from database error.
by marto (Cardinal) on May 19, 2012 at 13:56 UTC

    I suspect you've hit a limit imposed by their database. The documentation suggests you download the database locally so that you're not hammering theirs.

    "Another example is the ability to blast a sequence using the facilities as NCBI. Please be careful not to abuse the compute that NCBI provides and so use this only for individual searches. If you want to do a large number of BLAST searches, please download the blast package locally."
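If a local copy is not an option, one way to go easier on the remote server is to fetch the IDs in small batches with a pause between them, and to record failures instead of stopping at the first one. The following is only a minimal sketch of that idea: `batches` and `fetch_throttled` are hypothetical helper names, the fetch routine is passed in as a code reference, and the batch size and pause shown in the usage comment are illustrative values, not documented NCBI limits.

```perl
use strict;
use warnings;

# Hypothetical helper: split a list of IDs into batches of at most $size.
sub batches {
    my ($size, @ids) = @_;
    die "batch size must be at least 1\n" if $size < 1;
    my @out;
    push @out, [ splice(@ids, 0, $size) ] while @ids;
    return @out;
}

# Hypothetical helper: call $arg{fetch} once per ID, sleep $arg{pause}
# seconds between batches, and collect the IDs that failed (the fetch
# died or returned undef) instead of aborting the whole run.
sub fetch_throttled {
    my (%arg) = @_;
    my (@ok, @failed);
    my @batches = batches($arg{batch_size}, @{ $arg{ids} });
    for my $i (0 .. $#batches) {
        for my $id (@{ $batches[$i] }) {
            my $seq = eval { $arg{fetch}->($id) };
            if (defined $seq) { push @ok, $seq }
            else              { push @failed, $id }
        }
        # No need to wait after the last batch.
        sleep $arg{pause} if $arg{pause} && $i < $#batches;
    }
    return (\@ok, \@failed);
}

# Possible use with the script from the question (assumes Bio::Perl is
# loaded and @ids holds the cleaned-up IDs; the numbers are guesses):
#
#   my ($ok, $failed) = fetch_throttled(
#       ids        => \@ids,
#       batch_size => 100,
#       pause      => 3,
#       fetch      => sub { get_sequence("genbank", $_[0]) },
#   );
#   print "failed: @$failed\n" if @$failed;
```

The list of failed IDs can be fed back into a second run later, which also shows whether the failures depend on particular IDs or on the volume of requests.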

Re: more then 2000 file download from database error.
by sauoq (Abbot) on May 19, 2012 at 14:02 UTC

    Do you have a question?

    My guess is that the error is coming from GenBank.

    -sauoq
    "My two cents aren't worth a dime.";