Hello everyone. I am new to programming, and new to Perl as well, of course. I need to write a script to extract some information from a large file. My file looks like

# BLASTP 2.2.28+
# Query: gi|338220664|gb|EGP06123.1| hypothetical protein GEW_00005 [Pasteurella multocida subsp. gallicida str. Anand1_poultry]
# Database: nr-25sep
# Fields: query id, subject id, % identity, alignment length, mismatches, gap opens, q. start, q. end, s. start, s. end, evalue, bit score
# 2 hits found
gi|338220664|gb|EGP06123.1| gi|45383702|ref|NP_989542.1| 45.15 206 96 7 3 204 28 220 1e-51 170
gi|338220664|gb|EGP06123.1| gi|15419940|gb|AAK97214.1| 44.17 206 98 7 3 204 28 220 5e-50 166
# BLASTP 2.2.28+
# Query: gi|338220666|gb|EGP06125.1| hypothetical protein GEW_00015 [Pasteurella multocida subsp. gallicida str. Anand1_poultry]
# Database: nr-25sep
# 0 hits found
# BLASTP 2.2.28+
# Query: gi|338220651|gb|EGP06111.1| hypothetical protein GEW_00275 [Pasteurella multocida subsp. gallicida str. Anand1_poultry]
# Database: nr-25sep
# 0 hits found

I basically want to extract the "query line", and in particular the number after "gi" in that line, but only for the queries that have 0 hits. So in this case my matching line would be "# 0 hits found". I have written a small script which extracts the matching line, but I am unable to extract the query line and the number after "gi" in it. My code is

sub getGI {
    open(FILE, "twentySeq1e-10.out") or die("Cannot open file");
    while (<FILE>) {
        my $line = $_;
        if ($line =~ /# 0 hits found/) {
            print "$line\n";
        }
    }
}

The output I want is the number after "gi" in the query line, only for the queries that have 0 hits. For example, in this case the output would be

338220666
338220651

The "query" line is two lines before the matching line. If someone could help me with this I would be grateful. Thanks
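
One possible approach, offered as a minimal sketch rather than a definitive solution: instead of counting two lines back, remember the GI number from the most recent "# Query:" line and print it when a "# 0 hits found" line appears. This assumes every record has its "# Query:" line before its "hits found" line, as in the sample above. The subroutine name zero_hit_gis and the in-memory sample are hypothetical, used here only so the example is self-contained.

```perl
use strict;
use warnings;

# Hypothetical helper: return the GI numbers of all queries that
# report "# 0 hits found". It takes an open filehandle, so it works
# on a real file or on an in-memory string.
sub zero_hit_gis {
    my ($fh) = @_;
    my ($current_gi, @gis);
    while (my $line = <$fh>) {
        if ($line =~ /^#\s*Query:\s*gi\|(\d+)\|/) {
            # Remember the GI number of the most recent query line.
            $current_gi = $1;
        }
        elsif ($line =~ /^#\s*0 hits found/ && defined $current_gi) {
            # This record has no hits: keep the remembered GI number.
            push @gis, $current_gi;
            undef $current_gi;
        }
    }
    return @gis;
}

# Small in-memory sample shaped like the report in the question.
my $sample = <<'END';
# BLASTP 2.2.28+
# Query: gi|338220664|gb|EGP06123.1| hypothetical protein GEW_00005
# Database: nr-25sep
# 2 hits found
gi|338220664|gb|EGP06123.1| gi|45383702|ref|NP_989542.1| 45.15 206 96 7 3 204 28 220 1e-51 170
# BLASTP 2.2.28+
# Query: gi|338220666|gb|EGP06125.1| hypothetical protein GEW_00015
# Database: nr-25sep
# 0 hits found
# BLASTP 2.2.28+
# Query: gi|338220651|gb|EGP06111.1| hypothetical protein GEW_00275
# Database: nr-25sep
# 0 hits found
END

open my $fh, '<', \$sample or die "Cannot open in-memory sample: $!";
print "$_\n" for zero_hit_gis($fh);
close $fh;
```

To run it against the real report, the in-memory open could be replaced with open my $fh, '<', 'twentySeq1e-10.out' or die "Cannot open file: $!"; the rest stays the same. Tracking the last query line avoids depending on the exact number of lines between "# Query:" and "# 0 hits found".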


In reply to Print a previous to previous of a matching line by ag88
