Hi!
I am using bioperl to parse a blast report. I am having
difficulty parsing only the best hsp(high scoring pair)info
as well as the alignment. I currently can get
the alignment to print(clustalW format) but not the
other info concerning the hsp to an out file.

#!/usr/bin/perl use warnings; use strict; use Bio::SearchIO; use Bio::AlignIO; use strict; my $out = "C:/Documents and Settings/mydir/Desktop/current/bl2seq/1311 +7_aln.out"; my $dir = 'C:/Documents and Settings/mydir/Desktop/current/bl2seq/'; my $in = new Bio::SearchIO(-format => 'blast', -file => $dir.'13117_2.out'); while( my $result = $in->next_result ) { while( my $hit = $result->next_hit ) { while( my $hsp = $hit->next_hsp ) { if( $hsp->length('total') > 1000 ) { if ( $hsp->percent_identity >= 90 ) { if( $hsp->score ){ if( $hsp->gaps ){ if( $hsp->expect ){ print "Hit= ", $hit->name, " Length=", $hsp->length('total'), " Score=", $hsp->score,"\n"; print "Gaps=", $hsp->gaps, " Expect=", $hsp->expect, " Percent_id=", $hsp->percent_identity, "\n"; my $aln = $hsp->get_aln; my $alnIO = Bio::AlignIO->new(-format =>"clustalw", -file +=> $out); $alnIO->write_aln($aln); } } } } } } } } __DATA__ Query= 13117 Mus musculus genomic with Cassette inserted (10,084 letters) >13117/13117_fasta.Contig1 Length = 1453 Score = 2046 bits (1032), Expect = 0.0 Identities = 1088/1100 (98%), Gaps = 5/1100 (0%) Strand = Plus / Minus + Query: 1001 taggaaccattttttttttttct-gtgtttccttca-tcctgtgactctgccaacc-a +ac 1057 ||||||||||||||||||||| | |||||||||||| |||||||||||| |||||| | +|| Sbjct: 1113 taggaaccatttttttttttttttgtgtttccttcaatcctgtgactctaccaaccca +ac 1054 + Query: 1058 tttctaatccctctacctgcaaactacaaactctggtctgccagcaggttatgaattg +ta 1117 |||||||||| ||||||||||||||||||||||||||||||||||||||||| |||| +|| Sbjct: 1053 tttctaatccttctacctgcaaactacaaactctggtctgccagcaggttataaattt +ta 994 + Query: 1118 aagaaatttcattatgcttttgtctgcaaaatcccccagttctcagaaacaggg-tta +aa 1176 |||||||||||||||||||||||||||||||||||||||||||||||||||||| ||| +|| Sbjct: 993 aagaaatttcattatgcttttgtctgcaaaatcccccagttctcagaaacaggggtta +aa 934 + Query: 1177 gccaatgtgtagagaagaatgaacaaaa-tgcagaaatgcagggaatcccagtgcctc +at 1235 ||||||| |||||||||||||||||||| |||||||| |||||||||||||||||||| +|| Sbjct: 933 gccaatgggtagagaagaatgaacaaaaatgcagaaaagcagggaatcccagtgcctc +at 874 + Query: 1236 ggggtgggggtggggtgggagtgcagggtcagcagggaccatccagatgctgggacac +cc 1295 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| +|| Sbjct: 873 ggggtgggggtggggtgggagtgcagggtcagcagggaccatccagatgctgggacac +cc 814 + Query: 1296 ctgtgaaatatcacaccctctctgccccgtgctgtcctggattcaggtcaaactggat +tt 1355 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| +|| Sbjct: 813 ctgtgaaatatcacaccctctctgccccgtgctgtcctggattcaggtcaaactggat +tt 754 + Query: 1356 caatctgcataatttgtcccatttagaagggttgggagagggtgagaacaggagggtg +aa 1415 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| +|| Sbjct: 753 caatctgcataatttgtcccatttagaagggttgggagagggtgagaacaggagggtg +aa 694 + Query: 1416 gactgaggaggggggatagatactccccccatcccaaaggcaggcatgaagtagtggc +ca 1475 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| +|| Sbjct: 693 gactgaggaggggggatagatactccccccatcccaaaggcaggcatgaagtagtggc +ca 634 + Query: 1476 gaggaaaccagagtgaacaggctggagtgaaggctggtagctgctagagagaggaggt +gg 1535 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| +|| Sbjct: 633 gaggaaaccagagtgaacaggctggagtgaaggctggtagctgctagagagaggaggt +gg 574 + Query: 1536 gaccgggaggagcagcaggcttgaaaagcacagggtaattgtcagcccagcactgtag +gg 1595 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| +|| Sbjct: 573 gaccgggaggagcagcaggcttgaaaagcacagggtaattgtcagcccagcactgtag +gg 514 + Query: 1596 ggcaatttagtctacttacccagtggcaagaagtaaaaaaaaagtgggctgtgactta +aa 1655 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| +|| Sbjct: 513 ggcaatttagtctacttacccagtggcaagaagtaaaaaaaaagtgggctgtgactta +aa 454 + Query: 1656 gatcttaagactcactagagagggaggggttagcctgggagtgggaggagggagtgca +ga 1715 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| +|| Sbjct: 453 gatcttaagactcactagagagggaggggttagcctgggagtgggaggagggagtgca +ga 394 + Query: 1716 gtttaggaggagacagagactctggagggtgcaaggaagtgtctttagctgagctggg +ct 1775 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| +|| Sbjct: 393 gtttaggaggagacagagactctggagggtgcaaggaagtgtctttagctgagctggg +ct 334 + Query: 1776 gggctgggacgggaaggtagactgaagctgcctgaactgctcatcccgctgggtctga +gc 1835 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| +|| Sbjct: 333 gggctgggacgggaaggtagactgaagctgcctgaactgctcatcccgctgggtctga +gc 274 + Query: 1836 tcccagtggggcggttcgcatgcctggaaggactctaagcaccctcagacctcttact +ct 1895 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| +|| Sbjct: 273 tcccagtggggcggttcgcatgcctggaaggactctaagcaccctcagacctcttact +ct 214 + Query: 1896 gaggttcaagccaatccttcagggtgcttgctgagttggtgacccatctctggaactg +gt 1955 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| +|| Sbjct: 213 gaggttcaagccaatccttcagggtgcttgctgagttggtgacccatctctggaactg +gt 154 + Query: 1956 gtccctgggactcactgcccagcttcacttcattcagcaccatgggtaccgatttaaa +tg 2015 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| +|| Sbjct: 153 gtccctgggactcactgcccagcttcacttcattcagcaccatgggtaccgatttaaa +tg 94 + Query: 2016 atccagtggtcctgcagaggagagattgggagaatcccggtgtgacacagctgaacag +ac 2075 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| +|| Sbjct: 93 atccagtggtcctgcagaggagagattgggagaatcccggtgtgacacagctgaacag +ac 34 Query: 2076 tagccgcccaccctcccttt 2095 |||||||||||||||||||| Sbjct: 33 tagccgcccaccctcccttt 14 Score = 40.1 bits (20), Expect = 1e-005 Identities = 20/20 (100%) Strand = Plus / Minus Query: 7698 ggggtgggggtggggtggga 7717 |||||||||||||||||||| Sbjct: 873 ggggtgggggtggggtggga 854 Score = 28.2 bits (14), Expect = 0.046 Identities = 14/14 (100%) Strand = Plus / Minus Query: 8817 catttttttttttt 8830 |||||||||||||| Sbjct: 1106 catttttttttttt 1093 Score = 26.3 bits (13), Expect = 0.18 Identities = 13/13 (100%) Strand = Plus / Plus Query: 363 cccctccctctct 375 ||||||||||||| Sbjct: 425 cccctccctctct 437 Score = 26.3 bits (13), Expect = 0.18 Identities = 13/13 (100%) Strand = Plus / Minus Query: 7649 agggtgagaacag 7661 ||||||||||||| Sbjct: 715 agggtgagaacag 703 Score = 24.3 bits (12), Expect = 0.71 Identities = 12/12 (100%) Strand = Plus / Plus Query: 5507 cccccctcctca 5518 |||||||||||| Sbjct: 679 cccccctcctca 690 Score = 24.3 bits (12), Expect = 0.71 Identities = 12/12 (100%) Strand = Plus / Minus Query: 8819 tttttttttttt 8830 |||||||||||| Sbjct: 1101 tttttttttttt 1090 Score = 24.3 bits (12), Expect = 0.71 Identities = 12/12 (100%) Strand = Plus / Minus Query: 8819 tttttttttttt 8830 |||||||||||| Sbjct: 1102 tttttttttttt 1091 Score = 24.3 bits (12), Expect = 0.71 Identities = 12/12 (100%) Strand = Plus / Minus Query: 8819 tttttttttttt 8830 |||||||||||| Sbjct: 1103 tttttttttttt 1092 Score = 22.3 bits (11), Expect = 2.8 Identities = 11/11 (100%) Strand = Plus / Minus Query: 598 tcctgcagagg 608 ||||||||||| Sbjct: 84 tcctgcagagg 74 Score = 22.3 bits (11), Expect = 2.8 Identities = 11/11 (100%) Strand = Plus / Minus Query: 4644 gcctgaactgc 4654 ||||||||||| Sbjct: 304 gcctgaactgc 294 Score = 22.3 bits (11), Expect = 2.8 Identities = 11/11 (100%) Strand = Plus / Minus Query: 4040 ctgcctgaact 4050 ||||||||||| Sbjct: 306 ctgcctgaact 296 Score = 22.3 bits (11), Expect = 2.8 Identities = 11/11 (100%) Strand = Plus / Minus Query: 9957 ttagctgagct 9967 ||||||||||| Sbjct: 349 ttagctgagct 339 Score = 22.3 bits (11), Expect = 2.8 Identities = 11/11 (100%) Strand = Plus / Plus Query: 9036 ctgtctcctcc 9046 ||||||||||| Sbjct: 378 ctgtctcctcc 388 Score = 22.3 bits (11), Expect = 2.8 Identities = 11/11 (100%) Strand = Plus / Plus Query: 7830 tcctcccactc 7840 ||||||||||| Sbjct: 405 tcctcccactc 415 Score = 22.3 bits (11), Expect = 2.8 Identities = 11/11 (100%) Strand = Plus / Minus Query: 368 ccctctctgcc 378 ||||||||||| Sbjct: 798 ccctctctgcc 788 Score = 22.3 bits (11), Expect = 2.8 Identities = 11/11 (100%) Strand = Plus / Plus Query: 9047 cccaccccacc 9057 ||||||||||| Sbjct: 855 cccaccccacc 865 Score = 22.3 bits (11), Expect = 2.8 Identities = 11/11 (100%) Strand = Plus / Plus Query: 9045 cccccacccca 9055 ||||||||||| Sbjct: 864 cccccacccca 874 Score = 22.3 bits (11), Expect = 2.8 Identities = 11/11 (100%) Strand = Plus / Plus Query: 8608 tttctgagaac 8618 ||||||||||| Sbjct: 945 tttctgagaac 955 Score = 22.3 bits (11), Expect = 2.8 Identities = 11/11 (100%) Strand = Plus / Plus Query: 564 aaaaaaaaaat 574 ||||||||||| Sbjct: 1095 aaaaaaaaaat 1105 Score = 22.3 bits (11), Expect = 2.8 Identities = 11/11 (100%) Strand = Plus / Minus Query: 7639 tttgttaagaa 7649 ||||||||||| Sbjct: 1436 tttgttaagaa 1426 Lambda K H 1.37 0.711 1.31 Gapped Lambda K H 1.37 0.711 1.31 Matrix: blastn matrix:1 -3 Gap Penalties: Existence: 5, Extension: 2 Number of Sequences: 1 Number of Hits to DB: 562 Number of extensions: 37 Number of successful extensions: 37 Number of sequences better than 10.0: 1 Number of HSP's gapped: 24 Number of HSP's successfully gapped: 21 Length of query: 10084 Length of database: 1453 Length adjustment: 12 Effective length of query: 10072 Effective length of database: 1441 Effective search space: 14513752 Effective search space used: 14513752 X1: 11 (21.8 bits) X2: 15 (29.7 bits) X3: 50 (99.1 bits) S1: 11 (22.3 bits) S2: 11 (22.3 bits)

I am only interested in parsing out the first hit, which is
the best hit with greater than 98% identity. Any ideas?

In reply to parsing blast output by lomSpace

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.