It wouldn't be easy to accumulate the item numbers as I go, because the UnionFind data structure keeps changing. At any given point, I can find out which partition a vertex belongs to by looking it up with "find". So the easiest way to get the partitions is to gather them up at the end.
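To make "gather them up at the end" concrete, here is a minimal sketch in Python (for illustration only; it is not the code from this thread). The `UnionFind` class, `partition_items`, and the `item_keys` mapping of item name to key set are all hypothetical names, and I am assuming that two items belong together whenever they share a key.

```python
class UnionFind:
    """Hypothetical minimal union-find over hashable items."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        # Register unseen items as their own root.
        self.parent.setdefault(x, x)
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra


def gather_partitions(uf, items):
    # Roots keep changing while unions happen, so group only at the end.
    groups = {}
    for item in items:
        groups.setdefault(uf.find(item), []).append(item)
    return list(groups.values())


def partition_items(item_keys):
    # Assumption: items that share any key belong to the same partition.
    uf = UnionFind()
    first_owner = {}
    for item, keys in item_keys.items():
        uf.find(item)  # register items even if they have no keys
        for key in keys:
            if key in first_owner:
                uf.union(first_owner[key], item)
            else:
                first_owner[key] = item
    return gather_partitions(uf, item_keys)
```

The grouping step asks "find" for each item's root only after all unions are done, so it never depends on an intermediate state of the structure.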
However, I am still seeing a bug. Some of my "one-item" partitions seem to share keys with other partitions, which should not happen if every union was recorded. Cases near the beginning of the item set (like "iaac") tend to have this problem. Trying to gather the items into partitions as I go along might help me find the bug.
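One way to pin down this kind of bug is a consistency check run over the finished partitions. The sketch below is in Python and is only an illustration: `find_shared_keys`, `partitions` (a list of lists of item names), and `item_keys` (item name to key set) are hypothetical names, not part of the original code.

```python
def find_shared_keys(partitions, item_keys):
    """Return {key: [partition indexes]} for keys seen in more than one partition."""
    seen = {}
    for index, members in enumerate(partitions):
        for item in members:
            for key in item_keys[item]:
                seen.setdefault(key, set()).add(index)
    # A key listed under two or more partitions marks a missed union.
    return {key: sorted(ix) for key, ix in seen.items() if len(ix) > 1}
```

An empty result means the partitions really are disjoint by key; any entry names the key and the partitions that should have been merged.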
The purpose of this pass is to find all the completely distinct partitions. There's no point in looking for n-ary unions among things that have no bits in common. So I would apply the original algorithm to each subset.
This pass will also help halley decide if the keywords are too generic. If it's all one big clump, there may be too many common words in the set.
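As a rough way to quantify "one big clump", the share of items that land in the largest partition can be computed. This is a Python sketch with a hypothetical function name; what share counts as "too generic" is a judgment call that the post does not specify.

```python
def largest_partition_share(partitions):
    """Fraction of all items that fall into the largest partition."""
    total = sum(len(p) for p in partitions)
    if total == 0:
        return 0.0
    return max(len(p) for p in partitions) / total
```

A value near 1.0 would suggest that most items are chained together by common keywords.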