Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,
I am trying to use LWP::Simple to get sequences from a website. The web address for each sequence is always the same except for the sequence id at the end. I am only retrieving 537/603 sequences with the code below.
Apparently some of my sequences can't be found. However, if it can't find a sequence, I am trying to print an error message saying so in the output. This is the part that I can't get to work.
I wondered if someone could suggest how I can print an error message if the sequence can't be found (e.g. the web page is empty or doesn't exist). Thanks!

    use strict;
    use LWP::Simple;

    my @fasta;
    my $o;

    # Note: @accession defined earlier has 603 elements
    for (my $i=0; $i<@accession; $i++) {
        if ($accession[$i] =~ /^(\w+)\_(\w+)\_(\w+)\;/) {
            #print "$accession[$i]\n";
            if ($1 eq $3) {
                # USE $1 AS THE ACCESSION NUMBER
                my $seq = get "http://us.expasy.org/cgi-bin/get-sprot-fasta?$1";
                #print "$seq\n";
                push (@fasta, $seq);
                push @fasta, "> COULDN'T FIND IT" unless defined $seq;
            }
            else {
                # USE $3 AS THE ACCESSION NUMBER
                my $seq = "http://us.expasy.org/cgi-bin/get-sprot-fasta?$3";
                push (@fasta, $seq);
                push @fasta, "> COULDN'T FIND IT" unless defined $seq;
            }
        }
    }
    #print "$o\n";

    # fasta only has 537 / 603 elements
    print "@fasta\n";

Replies are listed 'Best First'.
Re: LWP::Simple problems
by chromatic (Archbishop) on Jan 29, 2005 at 16:31 UTC

    I think this code is shorter but still does what you want.

        use strict;
        use LWP::Simple;

        my @fasta;

        for my $acc (@accession) {
            # assumes a numeric id; widen \d+ if your ids contain letters
            next unless $acc =~ /_(\d+);/;
            my $id  = $1;
            my $seq = get("http://us.expasy.org/cgi-bin/get-sprot-fasta?$id");

            unless (defined $seq) {
                print "Couldn't find id $id\n";
                next;
            }

            push @fasta, $seq;
        }

      Don't forget to sleep briefly between page hits.
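
One way to pause between hits, as a rough sketch only. The helper name `fetch_politely` and its arguments are made up for illustration. The fetcher is passed in as a code ref so the loop can be tried without touching the network; when none is given it falls back to LWP::Simple's `get`.

```perl
use strict;
use warnings;

# Hypothetical helper: fetch each URL in turn, pausing between requests.
# $fetch is an optional code ref; by default LWP::Simple is loaded on demand
# and its get() is used.
sub fetch_politely {
    my ($urls, $pause_seconds, $fetch) = @_;
    $fetch ||= do { require LWP::Simple; \&LWP::Simple::get };

    my %result;
    for my $i (0 .. $#$urls) {
        sleep $pause_seconds if $i > 0;    # pause between hits, not before the first
        # undef here means the fetch failed
        $result{ $urls->[$i] } = $fetch->( $urls->[$i] );
    }
    return \%result;
}
```

Something like `fetch_politely(\@urls, 2)` would then wait two seconds between requests; how long is polite depends on the site.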


      Dave

        If it's anything like the standard expasy site, sleep forever!
        You should read the robots.txt file, as this isn't allowed.
        That said, I can't check the US mirror at the moment; perhaps you've DoS'ed it!
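
If you want the script itself to honour robots.txt, LWP::RobotUA (shipped with libwww-perl) is one option. A minimal sketch follows. The agent name, contact address and the two helper names are placeholders, and I have not checked what the expasy robots.txt actually says.

```perl
use strict;
use warnings;
use LWP::RobotUA;

# Hypothetical agent name and contact address; replace with your own.
sub make_robot_ua {
    my $ua = LWP::RobotUA->new('fasta-fetcher/0.1', 'you@example.com');
    $ua->delay(1/60);    # delay is given in minutes, so this is about one second
    return $ua;
}

# Returns the page body, or undef after a warning. Requests that robots.txt
# forbids come back as error responses, so they end up in the warning branch.
sub fetch_if_allowed {
    my ($ua, $url) = @_;
    my $response = $ua->get($url);
    return $response->decoded_content if $response->is_success;
    warn "Skipping $url: ", $response->status_line, "\n";
    return;
}
```

Apart from the robots.txt check and the enforced delay, the object behaves like an ordinary LWP::UserAgent.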


        I should really do something about this apathy ... but I just can't be bothered
Re: LWP::Simple problems
by ZlR (Chaplain) on Jan 29, 2005 at 13:16 UTC
    Hello,

    I think this is not good:

        push (@fasta, $seq);
        push @fasta, "> COULDN'T FIND IT" unless defined $seq;

    It will push $seq even if it's not defined, so a failed fetch leaves an undef entry followed by the message.
    Use a full if/else statement:

        if (defined $seq) { push @fasta, $seq } else { push @fasta, "> COULDN'T FIND IT" }

    Or you can use the short A?B:C construct, which does the same thing:

        defined $seq ? push @fasta, $seq : push @fasta, "> COULDN'T FIND IT";

    As a note, I think the declaration for $seq should be up there with the other my variables. Also, $o is not a very good name :)

    Hope this helps,
    ZlR.

      defined $seq ? push @fasta, $seq : push @fasta, "> COULDN'T FIND IT" ;

      You can save a few (meagre) keystrokes and make the statement more readable (IMO):

      push( @fasta, defined($seq) ? $seq : "> COULDN'T FIND IT" );
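
To see the difference without hitting the site, here is a small self-contained sketch. The @results list is invented stand-in data, with undef playing the part of a failed get().

```perl
use strict;
use warnings;

# Stand-in for three fetch results; undef plays the part of a failed get().
my @results = (">seq1\nMKV", undef, ">seq3\nMLA");

my (@buggy, @fixed);
for my $seq (@results) {
    # Original pattern: always pushes $seq, then adds the message on failure.
    push @buggy, $seq;
    push @buggy, "> COULDN'T FIND IT" unless defined $seq;

    # Ternary form: exactly one element per result.
    push @fixed, defined($seq) ? $seq : "> COULDN'T FIND IT";
}

# prints "buggy: 4 elements, fixed: 3 elements"
printf "buggy: %d elements, fixed: %d elements\n", scalar @buggy, scalar @fixed;
```

The extra element in @buggy is the undef that was pushed before the message.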
Re: LWP::Simple problems
by foil (Sexton) on Jan 29, 2005 at 17:24 UTC
    Error response codes are all built into the module. I quote: "This module also exports the HTTP::Status constants and procedures. You can use them when you check the response code from getprint(), getstore() or mirror()." Using these can give you more info if a request failed ...
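
A rough sketch of what that could look like with getstore(), which returns the HTTP status code. The helper names `describe_status` and `store_and_report` are made up for illustration, and the functions are imported explicitly from HTTP::Status to make it obvious where they come from.

```perl
use strict;
use warnings;
use LWP::Simple qw(getstore);
use HTTP::Status qw(is_success status_message);

# Hypothetical helper: turn a numeric HTTP status into a readable line.
sub describe_status {
    my ($rc) = @_;
    my $text = status_message($rc) || 'unknown status';
    return is_success($rc) ? "OK ($rc $text)" : "FAILED ($rc $text)";
}

# Hypothetical wrapper: save $url to $file and report what happened.
sub store_and_report {
    my ($url, $file) = @_;
    my $rc = getstore($url, $file);    # getstore() returns the HTTP status code
    print "$url: ", describe_status($rc), "\n";
    return is_success($rc);
}
```

Unlike get(), which only hands back undef on failure, this tells you whether the miss was a 404, a 500, or something else.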