You have almost arrived at yet another solution to your problem. You have made good progress from your initial posting.
I agree with GrandFather's comments.
Also, note the warning in perlsyn about modifying the list that a foreach loop is iterating over:
If any part of LIST is an array, "foreach" will get very confused if you add or remove elements within the loop body, for example with "splice". So don’t do that.
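As a minimal sketch of one way to stay clear of that warning, the loop below walks the array by index with while instead of foreach, and only advances the index when nothing was removed. The sample data and the variable names (@items, @removed) are made up for illustration and are not from the thread.

```perl
use strict;
use warnings;

# Hypothetical sample data, not taken from the thread.
my @items = qw(a1 b2 a3 c4 a5);
my @removed;

# Walk the array by index. Advance only when nothing was removed,
# because splice shifts the following elements down into position $i.
my $i = 0;
while ($i < @items) {
    if ($items[$i] =~ /^a/) {
        push @removed, splice(@items, $i, 1);
    }
    else {
        $i++;
    }
}

print "kept: @items\n";
print "removed: @removed\n";
```

Because the loop condition re-reads the array length on every pass, shrinking the array inside the body is safe here.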
The way you derive "ID" does not reproduce the clusters you gave in your original post. Those clusters result only if you ignore everything up to and including the period in each half of the string and compare what remains.
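To make the comparison key concrete, here is a small sketch that drops everything up to and including the period in each half of a pair. The helper name pair_keys is hypothetical, and the sketch assumes each line holds two whitespace-separated halves, each containing one period.

```perl
use strict;
use warnings;

# Return the two comparison keys for one "pair" line, dropping
# everything up to and including the period in each half.
# pair_keys is a hypothetical helper, named here for illustration only.
sub pair_keys {
    my ($line) = @_;
    return map { (split /\./, $_, 2)[1] } split ' ', $line;
}

my @keys = pair_keys('ID5141.C1253 ID5144.C2039');
print "@keys\n";    # prints "C1253 C2039"
```

Two pairs would then belong to the same cluster when they share at least one of these keys.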
These issues are addressed somewhat in the following:
#!/usr/bin/perl
use strict;
use warnings;

my @arr = <DATA>;
chomp @arr;
local $, = "\n";

while (@arr) {
    my @str = shift @arr;
    my @reslt;
    push(@reslt, $str[0]);
    while (@str) {
        my $str = shift @str;
        my (undef, $s1, undef, $s2, $flag) = split(/[ \.]/, $str);
        my $count  = 0;
        my $acount = 0;    #to arrange o/p
        while ($count < @arr) {
            my $strtocheck = $arr[$count];
            if ($strtocheck =~ /$s1|$s2/) {
                $acount++;
                if ($acount == 2 || ($flag && $flag == 1)) {
                    unshift(@reslt, $strtocheck);
                    unshift(@str, $strtocheck . " 1");
                }
                else {
                    push(@reslt, $strtocheck);
                    push(@str, $strtocheck);
                }
                splice(@arr, $count, 1);
            }
            else {
                $count++;
            }
        }
    }
    print @reslt, "\n";
}
__DATA__
ID5141.C1665 ID5141.C2448
ID5141.C1253 ID5144.C2039
ID5141.C1596 ID5144.C1956
ID5141.C1906 ID5144.C2149
ID5141.C1221 ID5144.C1956
ID5141.C2149 ID5141.C2386
ID5141.C2039 ID5142.C1221
ID5141.C5887 ID5141.C7685
ID5141.C1005 ID5142.C2808
ID5141.C1046 ID5141.C1596
ID5141.C2386 ID5141.C4990
ID5141.C7685 ID5141.C4888
In reply to Re^3: clustering pairs
by ig
in thread clustering pairs
by sugar