I think ++Cristoforo has provided the appropriate changes required (in Re^4: Stuck in my final step of code using array of arrays).

Now that I see another example of input and expected output, I suspect 'none' is incorrect for either the start or the end of the range. I originally used this (in Re: Stuck in my final step of code using array of arrays) based on your description containing "... before and after them (if any) ..." in the OP.

Here's another script that uses virtually the same changes as Cristoforo supplied, but replaces 'none' with the values I think you want. I've included additional test data to cover all four cases: codes before only, before and after, after only, and neither before nor after the special code.

#!/usr/bin/env perl

use strict;
use warnings;

my %special = (PF03797 => 1);

{
    local $/ = "//\n";

    while (<DATA>) {
        my ($id) = /^ID:(\w+)/;
        my @data;

        while (/HIT:(\w+).*?SEQ_START:(\d+).*?(\d+)/g) {
            push @data, [ $1, $2, $3 ];
        }

        @data = sort { $a->[2] <=> $b->[2] } @data;

        for my $i (0 .. $#data) {
            if ($special{$data[$i][0]}) {
                my $start = $i == 0      ? $data[$i][1] : $data[$i - 1][2] + 1;
                my $end   = $i == $#data ? $data[$i][2] : $data[$i + 1][1] - 1;
                printf "%-41s %7s %4d %4d\n" => $id, $data[$i][0], $start, $end;
            }
        }
    }
}

__DATA__
ID:A0AWZ5_1___codes_before_only
HIT:PF12951 SCORE:40.0 EVALUE:2.2e-10 HMM_START:2 HMM_END:32 SEQ_START:421 SEQ_END:455
HIT:PF03797 SCORE:130.7 EVALUE:3.6e-40 HMM_START:7 HMM_END:261 SEQ_START:822 SEQ_END:1073
HIT:PF12951 SCORE:38.7 EVALUE:5.5e-10 HMM_START:1 HMM_END:32 SEQ_START:515 SEQ_END:547
//
ID:A0AWZ5_2___codes_before_and_after
HIT:PF12951 SEQ_START:120 SEQ_END:350
HIT:PF03797 SEQ_START:822 SEQ_END:1073
HIT:PF15789 SEQ_START:1515 SEQ_END:1547
HIT:PF00267 SEQ_START:1200 SEQ_END:1350
//
ID:A0AWZ5_3___codes_after_only
HIT:PF03797 SEQ_START:822 SEQ_END:1073
HIT:PF15789 SEQ_START:1515 SEQ_END:1547
HIT:PF00267 SEQ_START:1200 SEQ_END:1350
//
ID:A0AWZ5_4___codes_neither_before_nor_after
HIT:PF03797 SEQ_START:822 SEQ_END:1073
//

Output:

A0AWZ5_1___codes_before_only              PF03797  548 1073
A0AWZ5_2___codes_before_and_after         PF03797  351 1199
A0AWZ5_3___codes_after_only               PF03797  822 1199
A0AWZ5_4___codes_neither_before_nor_after PF03797  822 1073

-- Ken


In reply to Re^4: Stuck in my final step of code using array of arrays by kcott
in thread Stuck in my final step of code using array of arrays by Anonymous Monk
