in reply to Re^3: Stuck in my final step of code using array of arrays
in thread Stuck in my final step of code using array of arrays

I think ++Cristoforo has provided the appropriate changes required (in Re^4: Stuck in my final step of code using array of arrays).

Now that I see another example of input and expected output, I suspect 'none' is incorrect for either the start or the end of the range. I originally used 'none' (in Re: Stuck in my final step of code using array of arrays) because your description in the OP included "... before and after them (if any) ...".

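Here's another script that uses virtually the same changes as Cristoforo supplied but replaces 'none' with the values I think you want: when no code precedes the special code, the range starts at the special code's own SEQ_START; when no code follows it, the range ends at its own SEQ_END. I've included additional test data to cover all four cases: codes before only, codes before and after, codes after only, and neither.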

#!/usr/bin/env perl

use strict;
use warnings;

my %special = (PF03797 => 1);

{
    local $/ = "//\n";

    while (<DATA>) {
        my ($id) = /^ID:(\w+)/;
        my @data;

        while (/HIT:(\w+).*?SEQ_START:(\d+).*?(\d+)/g) {
            push @data, [ $1, $2, $3 ];
        }

        @data = sort { $a->[2] <=> $b->[2] } @data;

        for my $i (0 .. $#data) {
            if ($special{$data[$i][0]}) {
                my $start = $i == 0      ? $data[$i][1] : $data[$i - 1][2] + 1;
                my $end   = $i == $#data ? $data[$i][2] : $data[$i + 1][1] - 1;
                printf "%-41s %7s %4d %4d\n" => $id, $data[$i][0], $start, $end;
            }
        }
    }
}

__DATA__
ID:A0AWZ5_1___codes_before_only
HIT:PF12951 SCORE:40.0 EVALUE:2.2e-10 HMM_START:2 HMM_END:32 SEQ_START:421 SEQ_END:455
HIT:PF03797 SCORE:130.7 EVALUE:3.6e-40 HMM_START:7 HMM_END:261 SEQ_START:822 SEQ_END:1073
HIT:PF12951 SCORE:38.7 EVALUE:5.5e-10 HMM_START:1 HMM_END:32 SEQ_START:515 SEQ_END:547
//
ID:A0AWZ5_2___codes_before_and_after
HIT:PF12951 SEQ_START:120 SEQ_END:350
HIT:PF03797 SEQ_START:822 SEQ_END:1073
HIT:PF15789 SEQ_START:1515 SEQ_END:1547
HIT:PF00267 SEQ_START:1200 SEQ_END:1350
//
ID:A0AWZ5_3___codes_after_only
HIT:PF03797 SEQ_START:822 SEQ_END:1073
HIT:PF15789 SEQ_START:1515 SEQ_END:1547
HIT:PF00267 SEQ_START:1200 SEQ_END:1350
//
ID:A0AWZ5_4___codes_neither_before_nor_after
HIT:PF03797 SEQ_START:822 SEQ_END:1073
//

Output:

A0AWZ5_1___codes_before_only              PF03797  548 1073
A0AWZ5_2___codes_before_and_after         PF03797  351 1199
A0AWZ5_3___codes_after_only               PF03797  822 1199
A0AWZ5_4___codes_neither_before_nor_after PF03797  822 1073
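
For anyone who wants to test the boundary handling on its own, here is a minimal sketch that pulls the two ternaries into a sub. The name flank_range is hypothetical and not part of the script above. It assumes the hits are already sorted by SEQ_END, do not overlap, and are stored as [ code, seq_start, seq_end ].

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not in the script above): given hits already sorted
# by SEQ_END, each stored as [ code, seq_start, seq_end ], return the
# extended range for the hit at index $i.
sub flank_range {
    my ($hits, $i) = @_;

    # No earlier hit: start at this hit's own SEQ_START.
    # Otherwise: start one past the previous hit's SEQ_END.
    my $start = $i == 0 ? $hits->[$i][1] : $hits->[$i - 1][2] + 1;

    # No later hit: end at this hit's own SEQ_END.
    # Otherwise: end one before the next hit's SEQ_START.
    my $end = $i == $#$hits ? $hits->[$i][2] : $hits->[$i + 1][1] - 1;

    return ($start, $end);
}

# Second test record from the script, sorted by SEQ_END.
my @hits = (
    [ PF12951 =>  120,  350 ],
    [ PF03797 =>  822, 1073 ],
    [ PF00267 => 1200, 1350 ],
    [ PF15789 => 1515, 1547 ],
);

my ($start, $end) = flank_range(\@hits, 1);
print "$start $end\n";    # prints "351 1199"
```

With the logic in a sub, each of the four cases can be checked by passing a one-, two-, or three-element list without going through the DATA parsing.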

-- Ken

Re^5: Stuck in my final step of code using array of arrays
by Anonymous Monk on Mar 04, 2014 at 10:20 UTC
    Many thanks to both of you, greatly appreciated!!