in reply to Re^4: algorithm for 'best subsets'
in thread algorithm for 'best subsets'
Update: Okay. I hadn't seen your updated code when I wrote this. One question though. Couldn't you just accumulate the item numbers involved in each partition as you go, rather than rediscovering them afterwards?
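A rough sketch of what I mean by accumulating as you go (in Python rather than Perl, with made-up names and data; I'm assuming a new item gets merged into every existing partition it shares a keyword with, which may not match what your code actually does):

```python
def partition_items(item_keywords):
    """Group items into partitions whose keyword bitmaps are mutually disjoint.

    item_keywords: dict mapping item number -> int keyword bitmap
    (a hypothetical input shape, not the layout of %union_data).
    Returns a list of (union_bitmap, set_of_item_numbers) tuples.
    """
    partitions = []
    for item, bits in item_keywords.items():
        merged_bits = bits
        merged_items = {item}
        untouched = []
        for part_bits, part_items in partitions:
            if part_bits & bits:
                # Shares at least one keyword: absorb the partition and
                # carry its member item numbers along with the bitmap.
                merged_bits |= part_bits
                merged_items |= part_items
            else:
                untouched.append((part_bits, part_items))
        untouched.append((merged_bits, merged_items))
        partitions = untouched
    return partitions


# Made-up example: items 1..3 chain together through shared keywords,
# item 4 shares nothing with them.
example = {1: 0b00011, 2: 0b00100, 3: 0b00110, 4: 0b11000}
result = partition_items(example)
```

Each partition then carries its member list alongside the keyword union, so nothing has to be rediscovered afterwards.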
Then I still don't understand what the resultant dataset in %union_data represents.
The keys of the hash are a subset of the items.
But how does one item number represent a partition of the total items?
The values are a bitmap representing a set of keywords.
I think I understand that each value bitmap is the inclusive OR (union) of all the keywords found in a given partition, and that it has no intersection with the keywords of any other partition. Is that correct?
But there is no quick way(?) to determine which items are in each partition, as each 'item number' key represents its partition rather than identifying its members.
And each value (keyword bitmap) is also composite, so there is no way back to the individual item/keyword sets(?) in order to do the n-ary unions there, either.
Don't get me wrong, this is a quicker approach than the one I was pursuing--reducing the n-ary unions by excluding those that didn't share at least n keywords in a pairwise union--but I'm struggling to see how you go forward from where your code leaves off?
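To show what I mean by "no way back", here is a tiny sketch (Python, with made-up bitmaps) where two different groups of item keyword sets OR together to the same composite value:

```python
from functools import reduce
from operator import or_


def union(bitmaps):
    """Inclusive OR of a list of integer keyword bitmaps."""
    return reduce(or_, bitmaps, 0)


# Two different (hypothetical) groups of per-item keyword bitmaps.
group_a = [0b0011, 0b0110]
group_b = [0b0001, 0b0111]

# Both collapse to the same composite bitmap, so the composite alone
# cannot tell you which individual sets went into it.
same_union = union(group_a) == union(group_b)
```

So unless the individual item bitmaps are kept somewhere alongside the composite, the n-ary unions can't be rebuilt from the values alone.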