It appears that what you are after is a relation among the nodes such that two nodes are related if one overlays the other, directly or indirectly (or if the nodes are identical). Direct overlay is what is stated in the data. The indirect relations can be described as the transitive closure of the relation of immediate overlays. The desired chains would be the equivalence classes of this extended relationship.

Fortunately, there is a module (by Abigail) that implements transitive closure. It is the work-horse of the following solution:

#!/usr/local/bin/perl use strict; use warnings; $| = 1; use Vi::QuickFix; use Algorithm::Graphs::TransitiveClosure qw( floyd_warshall); my %graph; my %nodes; while (<DATA>) { my @next = split; @nodes{ @next} = (); while ( @next > 1 ) { # build refexive and symmetric relation my $top = shift @next; my $below = $next[ 0]; $graph{ $top}->{ $top} = 1; $graph{ $top}->{ $below} = 1; $graph{ $below}->{ $top} = 1; } } # build transitive closure floyd_warshall( \ %graph); # pull off chains my @chains; while ( keys %nodes ) { my $node = each %nodes; my @chain = sort { $a <=> $b } keys %{ $graph{ $node} }; push @chains, \ @chain; delete @nodes{ @chain}; } print "@$_\n" for sort { $a->[ 0] <=> $b->[ 0] } @chains; exit; __DATA__ 1 27 1 32 1 68 ...
The %graph hash is built much like %overlays in your code, except that both ascending and descending pairs are considered related. After the Floyd-Marshall algortithm has built the closure, the groups can be pulled off one by one.

The resulting output

1 5 6 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 +30 31 32 33 35 36 37 38 39 40 43 44 45 46 47 48 49 50 52 53 54 56 57 +58 61 65 66 68 72 73 74 75 80 82 84 87 88 89 90 91 92 93 94 95 96 97 +98 99 100 101 102 103 104 106 107 108 109 110 111 112 113 114 116 117 + 119 120 121 122 123 124 125 126 127 129 130 131 132 134 136 137 139 +140 141 145 146 147 148 149 150 151 152 153 154 155 158 159 160 161 1 +62 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 17 +9 180 181 182 183 184 185 186 187 188 189 191 192 193 194 195 196 197 + 198 199 200 201 777 888 999 4 42 51 76 85 86 133 666 34 63 59 71 60 64 67 62 81 83 115 69 78 105 118
shows indeed only one large group which contains the spurious one that appeared in your output.

Update: Come to think of it, what you're doing is determine the connected components of a graph. There are algorithms (and CPAN modules) that implement that directly. Using Floyd-Warshall in this way is possible, but probably less than optimal.

Anno


In reply to Re: Determine group membership for partial sequences by Anno
in thread Determine group membership for partial sequences by GrandFather

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.