Hello all, here is my code and the result I am getting when running BLAST:
    use strict;
    use Bio::SeqIO;
    use Bio::Tools::Run::StandAloneBlast;
    my %srna;
    my %genome;
    my $seqio_srna = Bio::SeqIO->new(- file=>'/home/jayakuma/script/graphics/JH_TCV_lateSL.clones.filtered.23_1_2008.fa',
                                     -format=>'fasta') or die "cannot open file\n";
    my $seqio_large = Bio::SeqIO->new(-file => '/home/jayakuma/script/graphics/TCV-jagger.fna',
                                      -format =>'fasta') or die " cannot opne the file\n";
    while (my $large_seq = $seqio_large->next_seq()){
        my $id = $large_seq->display_id;
        my $seq = $large_seq->seq;
        $genome{$id} = $seq;
        while (my $seq_srna = $seqio_srna->next_seq()){
            my $display_id = $seq_srna->display_id;
            my $seq = $seq_srna->seq;
            $srna{$display_id} = $seq;
            #rint "seq: $seq\n";
        }
        foreach my $seq_id (keys %srna){
            my $srnas = $srna{$seq_id};
            print "srnas: $srnas\n";
            foreach($srnas){
                my $blast = Bio::Tools::Run::StandAloneBlast->new(program=>'blastn',database=>'/home/jayakuma/script/graphics/TCV-jagger.fna');
                my $input = Bio::Seq->new(-seq=> $srnas);
                print $input->seq, "\n";
                my $blast_report = $blast->blastall($input);
                while (my $result = $blast_report->next_result()){
                    my $query_length = $result->query_length();
                    while (my $hit = $result->next_hit()){
                        my $id = $hit->accession();
                        while (my $hsp = $hit->next_hsp()){
                            if($hsp->frac_identical ==1 && $hsp ->length ==$query_length){
                                print "$srnas\t$id\n";
                            }
                        }
                    }
                }
            }
        }
    }
The error I am getting in this case is:
    srnas: TTTGCAGTATTGGACAAGCC
    TTTGCAGTATTGGACAAGCC
    Use of uninitialized value in pattern match (m//) at /usr/share/perl5/Bio/SeqIO/fasta.pm line 193, <GEN0> line 81643.
    Use of uninitialized value in print at /usr/share/perl5/Bio/Root/IO.pm line 407, <GEN0> line 81643.
    -------------------- WARNING ---------------------
    MSG: cannot find path to blastall
    ---------------------------------------------------
    Can't call method "next_result" on an undefined value at aligning.pl line 34, <GEN0> line 81643.
It's dying at blastall and so obviously can't get any results.
I have two files: one large file that contains about 4 KB of sequence,
and another file that contains a set of small sequences (around 4000 of them).
The large sequence's path is given as the database. Any help please, thanks in advance.

Re^3: Aligning sequence
by igelkott (Priest) on Feb 28, 2008 at 20:35 UTC
    May just be a typo but on line 6, "- file" should be "-file".

    From the first error message, I'd guess that this was an input format error. Little things like spaces in the fasta ID could kill the parser. Check your input around line 81643. If this script works on a small portion of the input, try with a region around the error.
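    To check the input around the reported line, something like the following rough sketch may help. It uses core Perl only; `check_fasta` is a hypothetical helper, and the rules it applies (a non-space ID right after `>`, nucleotide letters only in sequence lines) are assumptions for short reads like these, not the checks Bio::SeqIO itself performs.

```perl
use strict;
use warnings;

# Return a list of problems found in FASTA text; an empty list means
# nothing suspicious was seen. The rules are assumptions for short
# nucleotide reads, not a copy of what Bio::SeqIO does internally.
sub check_fasta {
    my ($text) = @_;
    my @problems;
    my $line_no     = 0;
    my $seen_header = 0;
    for my $line (split /\r?\n/, $text) {
        $line_no++;
        next if $line =~ /^\s*$/;    # ignore blank lines
        if ($line =~ /^>/) {
            $seen_header = 1;
            # A header needs a non-space ID directly after '>'.
            push @problems, "line $line_no: empty or space-prefixed ID"
                unless $line =~ /^>\S/;
        }
        elsif (!$seen_header) {
            push @problems, "line $line_no: sequence before first header";
        }
        elsif ($line !~ /^[ACGTUNacgtun]+$/) {
            push @problems, "line $line_no: unexpected characters";
        }
    }
    return @problems;
}

# Made-up sample: one clean record and one deliberately broken record.
my $sample = ">EH0MKSX01A0000.1\nCAGCGATGGGGATCAAGCTC\n> bad id\nCAGC GATG\n";
print "$_\n" for check_fasta($sample);
```

    For a real file, read the region around the reported line into a string and pass that string to the helper instead of `$sample`.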

    The second error message could just be an artifact because it really should know where "blastall" is kept. Just to make sure, check $blast->executable('blastall').
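    Independently of BioPerl, it is easy to see whether a `blastall` executable is visible to the script at all. The sketch below is only a shell-style lookup written in core Perl; `find_program` is a hypothetical helper, and BioPerl's own search may look in other places.

```perl
use strict;
use warnings;
use File::Spec;

# Return the full path of the first executable file called $name found in
# $ENV{BLASTDIR} (if set) or in the directories of $ENV{PATH}.
# Returns undef when nothing is found.
sub find_program {
    my ($name) = @_;
    my @dirs = grep { defined($_) && length($_) }
               $ENV{BLASTDIR}, File::Spec->path;
    for my $dir (@dirs) {
        my $candidate = File::Spec->catfile($dir, $name);
        return $candidate if -f $candidate && -x _;
    }
    return;
}

my $path = find_program('blastall');
print defined $path
    ? "blastall found at $path\n"
    : "blastall not found in BLASTDIR or PATH\n";
```

    If this reports that nothing was found, the "cannot find path to blastall" warning is probably genuine and not an artifact.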

      Hi there, I checked both of the things you mentioned. The last one, executable('blastall'), didn't cause any errors. Shrinking the file as you suggested didn't help either: it gives the same error, just with different line numbers. I even tried with a single sequence, and even that results in:
          CAGCGATGGGGATCAAGCTC
          Use of uninitialized value in pattern match (m//) at /usr/share/perl5/Bio/SeqIO/fasta.pm line 193, <GEN0> line 2.
          Use of uninitialized value in print at /usr/share/perl5/Bio/Root/IO.pm line 407, <GEN0> line 2.
          -------------------- WARNING ---------------------
          MSG: cannot find path to blastall
          ---------------------------------------------------
      I have also included a bit of my input sequence:
          >EH0MKSX01A0000.1
          CAGCGATGGGGATCAAGCTC
          >EH0MKSX01A006U.1
          ATTTGATAAAGCCATCGGAGGCTT
          >EH0MKSX01A00BH.1
          CAGCCGAGGGACCCACGATAC
          >EH0MKSX01A00O3.1
          CCAGCACGTATCCGTGGACGC
          >EH0MKSX01A00VX.1
          CGGATAGCGGGGCGGATATAGAT
          >EH0MKSX01A0133.1
          AAGATGCGTCGAACCTTCGGGG
          >EH0MKSX01A01AH.1
          GCGTATGAGGAGCCATGCAT
          >EH0MKSX01A01AK.1
          AATACAAGAGTAGCTAAGTTGTCC
          >EH0MKSX01A01FV.1
          CGATCCAATGATGCAGCCTT
          >EH0MKSX01A01IQ.1
          CAGGAGGAGAGTTCGTCAAA
          >EH0MKSX01A01M9.1
          AAACAGACCGCCCGCGCAGCG
          >EH0MKSX01A01MX.1
          ATGGTGAAGGGTGGGTCATGGT
          >EH0MKSX01A01PQ.1
          ATGATGCCGCCCAACTCGGTGA
          >EH0MKSX01A01SX.1
          CCCATGACTGGGAGGTCGTGGT
          >EH0MKSX01A01WT.1
          AGGGGCGGTTTGCCATGCATGC
      Sorry if that's too silly, but I couldn't spot it... this is driving me nuts :(
      Any help please, thanks.
        I disagree. Since the script doesn't work with a single sequence, my third suggestion is most likely. An error like "MSG: cannot find path to blastall" seems to be the main and possibly only problem.

        Suggested fix: set the environment variable $BLASTDIR, or add the BLAST directory to your $PATH, so that blastall can be found.
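        One way to try this from inside the script, as a sketch only: as far as I recall, the StandAloneBlast documentation says the variables must be in place before the module is loaded. The directory used here is a hypothetical placeholder, and loading the module at run time with `require` inside `eval` is my choice so that the snippet also runs on a machine without the BioPerl run package.

```perl
use strict;
use warnings;

# ASSUMPTION: hypothetical install location. Replace it with the
# directory that actually contains the blastall executable.
my $blast_dir = '/usr/local/blast/bin';

# Put the environment in place before the BLAST wrapper is loaded.
$ENV{BLASTDIR} = $blast_dir;
$ENV{PATH}     = join ':', $blast_dir, ($ENV{PATH} // '');

# Load the wrapper at run time, after the environment has been set.
my $loaded = eval { require Bio::Tools::Run::StandAloneBlast; 1 };
print $loaded
    ? "wrapper loaded with BLASTDIR=$ENV{BLASTDIR}\n"
    : "BioPerl run package not installed here\n";
```

        Exporting the same two variables in the shell before starting the script should have the same effect and avoids hard-coding a path in the code.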