Hi Guys, thanks for getting back to me so quickly. I figured I would explain a bit more before trying any of your suggestions. First of all, these are all strains of the same species of bacteria. We have a clustering algorithm that goes through a set of strains, finds the genes that are common to all of them (the "core" genes), and then finds all the genes that are distributed (not unique to any one strain, but not present in every strain either). My program counts, for each strain, the number of distributed genes it has in common with each other strain. I divide that count by the total number of distributed genes and subtract the result from one to get the "distance" between the two strains, so strains that share more distributed genes are closer together.
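To make the distance step concrete, here is a minimal sketch in Python (used only for illustration; the same logic maps directly onto Perl hashes). The strain names, gene IDs, and the `distance` function are made up, and I am assuming that "total number of distributed genes" means the union of distributed genes over all strains.

```python
def distance(genes_a, genes_b, total_distributed):
    """Return 1 - (shared distributed genes / total distributed genes)."""
    if total_distributed <= 0:
        raise ValueError("total_distributed must be positive")
    shared = len(set(genes_a) & set(genes_b))
    return 1 - shared / total_distributed


# Hypothetical data: the distributed genes found in each strain.
strains = {
    "A": {"g1", "g2", "g3"},
    "B": {"g2", "g3", "g4"},
    "C": {"g4"},
}

# Assumption: the total is the union of distributed genes over all strains.
total = len(set().union(*strains.values()))

d_ab = distance(strains["A"], strains["B"], total)
d_ac = distance(strains["A"], strains["C"], total)
d_bc = distance(strains["B"], strains["C"], total)
```

With this toy data A and B share two of the four distributed genes, so they come out closer to each other than either is to C.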
After that, the bacteria are grouped together by their nearest-lying neighbors. A group consists of bacteria that all share at least one common nearest neighbor. I have that pretty much under control. However, I now have a hash that holds, for each group, a list of these nearest-neighbor bacteria, and in some cases the lists overlap. Say bacteria A's nearest neighbor is D, so it goes into group one. B is nearest neighbor to C, so it goes into its own group two. But then D is nearest neighbor to B, so D and B go into group one. B is also part of group two, so those first two groups should really be just one.
Basically at this point the only decision that needs to be made is a binary "are any strains in group1 also in any other group? if so, merge the groups".
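One way to sketch that merge is with a small union-find (disjoint-set) structure. This is not what my program currently does, just an illustration in Python that maps onto Perl hashes. The function name `merge_overlapping` and the group contents are hypothetical. I am using union-find rather than a single pairwise comparison because merging two groups can create a new overlap with a third.

```python
def merge_overlapping(groups):
    """Merge groups that share at least one strain, following chains of overlap.

    `groups` maps a group id to a list of strain names.
    Returns a sorted list of sorted strain lists.
    """
    parent = {}

    def find(x):
        # Register unseen strains as their own root, then walk up to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps chains short
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for members in groups.values():
        members = list(members)
        for strain in members:
            find(strain)  # make sure single-member groups are registered too
        for other in members[1:]:
            union(members[0], other)

    merged = {}
    for strain in list(parent):
        merged.setdefault(find(strain), set()).add(strain)
    return sorted(sorted(group) for group in merged.values())


# Hypothetical groups: B appears in both group 1 and group 2.
groups = {
    1: ["A", "D", "B"],
    2: ["B", "C"],
    3: ["E", "F"],
}
result = merge_overlapping(groups)
```

Here groups 1 and 2 collapse into one group containing A, B, C and D, while group 3 is left alone because it shares no strain with the others.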