SAMPLE QUERY:
ID5141.C1665 ID5141.C2448
ID5141.C1253 ID5144.C2039
ID5141.C1596 ID5144.C1956
ID5141.C1906 ID5144.C2149
ID5141.C1221 ID5144.C1956
ID5141.C2149 ID5141.C2386
ID5141.C2039 ID5142.C1221
ID5141.C5887 ID5141.C7685
ID5141.C1005 ID5142.C2808
ID5141.C1046 ID5141.C1596
ID5141.C2386 ID5141.C4990
ID5141.C7685 ID5141.C4888
----------------------------
SAMPLE RESULT:
cluster1
ID5141.C1665 ID5141.C2448
cluster2
ID5141.C1253 ID5144.C2039
ID5141.C2039 ID5142.C1221
ID5141.C1221 ID5144.C1956
ID5141.C1596 ID5144.C1956
ID5141.C1046 ID5141.C1596
cluster3
ID5141.C1906 ID5144.C2149
ID5141.C2149 ID5141.C2386
ID5141.C2386 ID5141.C4990
cluster4
ID5141.C1005 ID5141.C2808
cluster5
ID5141.C5887 ID5141.C7685
ID5141.C7685 ID5141.C4888
----------------------------
Problem: Each line has a pair of IDs. Either ID in a pair may or may not match an ID on another line. If neither ID matches any ID on the other lines, the pair is treated as a separate cluster (for example, cluster1 and cluster4 in the result file), because the IDs in these clusters do not appear anywhere else in the file. But if one of the IDs on a line matches an ID on another line (a 'MATCH'), the clustering starts: the other lines are searched for an ID that matches the 'MATCH', and the search continues from 'MATCH' to 'MATCH' until no further 'MATCH' is found (for example, cluster2, cluster3, and cluster5 in the result file).

I have been trying to find a way to do this, but in vain. I am a beginner in Perl, and I only know how to deal with arrays to some extent. Do I need to use a hash for this problem? Will it help me solve it? Please guide me to the shortest or simplest way to deal with this problem. The program I wrote, and got stuck in the middle of, is this. I don't know how to proceed, actually :(
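#!/usr/bin/perl
use strict;
use warnings;

open(my $fh, '<', 'sample_query') or die "cannot open sample_query: $!";
chomp(my @array = <$fh>);
close $fh;

for (my $i = 0; $i <= $#array; $i++) {
    # Stuck here: this compares each line with itself, so it always matches.
    if ($array[$i] =~ /\Q$array[$i]\E/) {
        my $mark = $array[$i];
        if ($mark =~ /\Q$array[$i]\E/) {
            print "$array[$i]\n";
        }
    }
}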
Please guide me on how to proceed!
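
The chained matching described above is the problem of finding connected groups, and a hash does help here. What follows is only a minimal sketch, not a definitive solution. It uses a small union-find structure kept in a hash, which is one possible technique among several. The names `match_key` and `cluster_pairs` are made up for illustration. It also assumes that two IDs 'MATCH' when the part after the dot is equal, because the sample result puts ID5144.C2039 and ID5141.C2039 in the same cluster. If the whole ID must be equal instead, `match_key` can simply return its argument.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# ASSUMPTION: two IDs match when the part after the dot is the same,
# e.g. ID5144.C2039 and ID5141.C2039 both have the key "C2039".
sub match_key {
    my ($id) = @_;
    my ($key) = $id =~ /\.(\S+)$/;
    return defined $key ? $key : $id;
}

# Takes a list of [ $id_a, $id_b ] pairs and returns a list of clusters.
# Pairs that share a key, directly or through a chain of other pairs,
# end up in the same cluster.
sub cluster_pairs {
    my @pairs = @_;
    my %parent;    # union-find forest: key => parent key

    # Follow parent links until reaching a key that is its own parent.
    my $find = sub {
        my ($k) = @_;
        $parent{$k} = $k unless exists $parent{$k};
        $k = $parent{$k} while $parent{$k} ne $k;
        return $k;
    };

    # Pass 1: join the two keys of every pair into one group.
    for my $pair (@pairs) {
        my ($ra, $rb) = map { $find->(match_key($_)) } @$pair;
        $parent{$ra} = $rb if $ra ne $rb;
    }

    # Pass 2: collect the pairs group by group, keeping the groups
    # in the order in which they first appear in the input.
    my (%cluster_of, @clusters);
    for my $pair (@pairs) {
        my $root = $find->(match_key($pair->[0]));
        unless (exists $cluster_of{$root}) {
            $cluster_of{$root} = scalar @clusters;
            push @clusters, [];
        }
        push @{ $clusters[ $cluster_of{$root} ] }, $pair;
    }
    return @clusters;
}

# The pairs from SAMPLE QUERY, one pair per line.
my @pairs = map { [ split ' ' ] } grep { /\S/ } split /\n/, <<'END';
ID5141.C1665 ID5141.C2448
ID5141.C1253 ID5144.C2039
ID5141.C1596 ID5144.C1956
ID5141.C1906 ID5144.C2149
ID5141.C1221 ID5144.C1956
ID5141.C2149 ID5141.C2386
ID5141.C2039 ID5142.C1221
ID5141.C5887 ID5141.C7685
ID5141.C1005 ID5142.C2808
ID5141.C1046 ID5141.C1596
ID5141.C2386 ID5141.C4990
ID5141.C7685 ID5141.C4888
END

my @clusters = cluster_pairs(@pairs);
for my $n (0 .. $#clusters) {
    print "cluster", $n + 1, "\n";
    print "@$_\n" for @{ $clusters[$n] };
}
```

On the sample query this sketch groups the twelve pairs into five clusters with the same members as the sample result. The clusters are numbered by first appearance in the input and the pairs inside each cluster keep their input order, so the numbering and ordering can differ from the sample result. To read from the file instead, the here-document can be replaced by lines read from `sample_query`.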

In reply to clustering pairs by sugar
