in reply to Better algorithm or data structure?

I'm more than a little baffled... First, I'm not sure how to interpret this description:
Essentially, the state of the booleans in @bools will switch from false to true in some undefinable order. Each time one becomes true, it will affect the composite state of one or more sets in @sets. And if all the booleans referenced by any particular set are true, then I need to remove that set from @sets.

So, is it important that a set be removed as soon as the related booleans all turn true? Or could it instead be a two-step process: stuff happens to the booleans, and when that's all done, then you check the sets to see which ones should be removed?

I was puzzled because the OP code seemed to do nothing at all in terms of checking the status of the booleans that relate to a given set, as implied by the description. Instead, the code appeared to be deleting elements from the @sets AoA whenever the nested (inner) array happened to contain no zero values. (Update: now that I've seen the updated version of the OP, it all makes more sense -- thanks!)

Another question about the description: is it essential that elements of @sets be deleted, or would it suffice to keep the array a constant size, and "undef" elements when appropriate?

If you can avoid using splice (and if I understand the goal, which is still in doubt), you could build an AoH map of where the various @bools indexes occur in @sets, and only scan the relevant sets as you alter the values of the bools. Something like this:

my @sets = ...   # after values are loaded into the @sets AoA, build a map:

my @boolmap;
for my $set ( 0 .. $#sets ) {
    $boolmap[ $_ ]{ $set } = undef for ( @{ $sets[ $set ] } );
}

# now, as changes are made to bools, check just the relevant sets:
for my $next ( @someOrder ) {
    $bools[ $next ]++;
    for my $set ( keys %{ $boolmap[ $next ] } ) {
        if( defined( $sets[ $set ] )
            and grep( { $bools[ $_ ] } @{ $sets[ $set ] } ) == @{ $sets[ $set ] } ) {
            $sets[ $set ] = undef;
        }
    }
}
(But I feel like I'm not really understanding your goal, because I'm pretty sure that you of all people would have figured this out.)

Re^2: Better algorithm or data structure?
by BrowserUk (Patriarch) on Aug 30, 2010 at 00:26 UTC
    is it essential that elements of @sets be deleted, or would it suffice to keep the array a constant size, and "undef" elements when appropriate?

    Yes, deletion is essential. In use, new sets are added to @sets on the fly; the mechanism is essentially a queue. New sets ("work items") are continuously being added, asynchronously with the resources (represented by the booleans) becoming available. If I don't remove completed work items (sets for which all required resources have become available), the queue will grow endlessly.

    But your notion of using hashes is a good one. You've used a hash to index, against each boolean, the sets that use that boolean, and you're undefing rather than splicing to avoid screwing up that indexing. But if I also use a hash to hold the sets, I can delete a set from %sets without screwing up the index, and the per-boolean map greatly reduces the number of sets that need to be examined each time a boolean changes.
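
    Roughly what I have in mind (an untested sketch; %sets is keyed by an arbitrary work-item id that I've invented here, and %boolmap maps each resource index to the ids of the sets that need it):

        my @bools;
        my( %sets, %boolmap );
        my $nextId = 0;

        # enqueue a new work item: an array ref of resource indexes
        sub addSet {
            my $set = shift;
            my $id  = $nextId++;
            $sets{ $id } = $set;
            $boolmap{ $_ }{ $id } = undef for @$set;
            return $id;
        }

        # called when resource $next becomes available
        sub resourceReady {
            my $next = shift;
            $bools[ $next ] = 1;
            for my $id ( keys %{ $boolmap{ $next } } ) {
                # if every resource this set needs is now available, drop it from the queue
                if( grep( { $bools[ $_ ] } @{ $sets{ $id } } ) == @{ $sets{ $id } } ) {
                    delete $sets{ $id };
                }
            }
        }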

    Thanks. I'll have a go at implementing that tomorrow.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Given this added detail:
      The mechanism is essentially a queue. New sets ("work items") are being continuously added asynchronously with the resources (represented by the booleans) becoming available.

      This poses another problem, because the "boolmap" structure will grow continuously. If the "sets" queue can grow to the point that the process blows up after a while, then the accumulation of hash keys in the "boolmap" structure is going to have the same effect. You'll need a little more code (and more execution time -- but hopefully not too much) to delete "boolmap" hash keys as the corresponding sets are deleted.
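
      For instance (a sketch only, borrowing the hypothetical %sets / %boolmap names from the hash-based variant above): when a set is removed, walk its resource list and delete its id from each boolmap entry, dropping the entry itself once it is empty:

          sub removeSet {
              my $id  = shift;
              my $set = delete $sets{ $id };   # drop the completed work item
              for my $bool ( @$set ) {
                  delete $boolmap{ $bool }{ $id };
                  # discard the whole entry once no set references this boolean
                  delete $boolmap{ $bool } unless keys %{ $boolmap{ $bool } };
              }
          }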

      And one more question, which might be unimportant, but the "boolmap" makes it relevant: when adding a new "set" to the queue, what if all the booleans in this set are already true? (Not that I'm just curious -- it's something the implementation should take into account.)

        the accumulation of hash keys in the "boolmap" structure is going to have the same effect.

        Indeed. But once I've found a set ready for removal from the queue, I have its resource identifiers (formerly the boolean indexes), so I know which boolmap entries to look in; what I still have to locate is the set's own identifier (formerly the set index) within each of those entries, in order to remove it. So the values within the boolmap also have to become hashes rather than arrays, to avoid those linear searches.
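
        That is, with array values removing a completed set would mean a linear filter over every relevant boolmap entry, whereas with hash values it is a single delete (same invented names as the sketches above):

            # with array-valued boolmap entries, dropping completed set $id is a linear filter:
            for my $bool ( @{ $sets{ $id } } ) {
                @{ $boolmap{ $bool } } = grep { $_ != $id } @{ $boolmap{ $bool } };
            }

            # with hash-valued entries, it is a constant-time delete:
            for my $bool ( @{ $sets{ $id } } ) {
                delete $boolmap{ $bool }{ $id };
            }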

        When adding a new "set" to the queue, what if all the booleans in this set are already true?

        The scope for this is limited, but it is still possible. The solution is simple enough: it is easy to check, when a new set arrives, whether it has any outstanding resources; if it doesn't, it can be processed straight away and never needs to be added to the queue.
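
        Something along these lines at insertion time (again just a sketch with invented names; processNow() stands in for whatever handles a ready work item, and addSet() is from the earlier sketch):

            # hypothetical wrapper that bypasses the queue when possible
            sub submitSet {
                my $newSet = shift;    # incoming work item: array ref of resource indexes
                if( grep { !$bools[ $_ ] } @$newSet ) {
                    addSet( $newSet );        # at least one resource still outstanding: queue it
                }
                else {
                    processNow( $newSet );    # everything already available: handle it immediately
                }
            }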


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.