in reply to Design thoughts: iterator invalidation

I suppose that my thoughts are completely caught off-guard by the notion of “iterating through” a hash, which I consider to be a strictly unordered structure. I would therefore consider “the keys” to be strictly a set, not a sequence. Iterators, therefore, would eventually return all of the keys, but would not be guaranteed to do so in any particular order. There could be no concept of a “repeatable read.”
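
For what it's worth, Perl itself underscores the point: since 5.18 the hash seed is randomised per process, so even the same program will generally list the same keys in a different order from one run to the next:

    $ perl -E 'my %h = map { $_ => 1 } "a" .. "e"; say join ",", keys %h'
    $ perl -E 'my %h = map { $_ => 1 } "a" .. "e"; say join ",", keys %h'

Run those back-to-back and the two orderings will, in general, disagree.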

Edit: You describe this as a hash-like structure. If it is not, in fact, entirely based upon Perl5 “hashes,” but is instead some kind of truly awesome contrivance for storing (key, value) pairs ... which in your case would not surprise me at all ... then these comments might not apply in the slightest. I know this. As do you.

In the “simplistic” case of real Perl5 hashes, I would suggest merely letting the iterators run as they now do. Don't try to remedy the situation because, I think, you really can't. If an iterator is running over a volatile data structure, it might encounter some keys more than once and/or miss some keys entirely. If it is traversing a section of the data structure that has vanished beneath its feet, it can do nothing but return undef, and the threads must live with what they've got. (If it would be meaningful for them to know the difference, I would handle this by throwing an exception instead of returning undef.)
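
To make “live with what they've got” concrete, here is a minimal sketch. The iterator object, its next() method, and the get()/process() helpers are all hypothetical stand-ins, since I do not know your actual API:

    # Tolerant consumer: next() is assumed to return the next key,
    # or undef when it reaches the end *or* loses its footing.
    while ( defined( my $key = $iter->next ) ) {
        my $val = $hash->get( $key );   # may be undef: key deleted meanwhile
        next unless defined $val;
        process( $key, $val );
    }

    # If the threads need to tell 'done' apart from 'invalidated',
    # have next() die on invalidation instead of returning undef:
    my $completed = eval {
        while ( defined( my $key = $iter->next ) ) {
            process( $key, $hash->get( $key ) );
        }
        1;   # reached the natural end of the keys
    };
    warn "iteration was invalidated: $@" unless $completed;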

If what you actually need is for the threads to, say, start over when an insertion or a deletion occurs, maybe the most efficient mechanism would be to send the threads a signal. Each thread's handler would set a flag that causes it to break out of its current iterator loop, reset the flag, acquire a new iterator, and start over with it. (Knowing that any of them could have already encountered a problem.)
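
A sketch of that signalling scheme, using the simulated per-thread signals that the threads module's kill() method delivers (get_iterator() and process() are hypothetical stand-ins for your real API):

    use threads;

    sub worker {
        my $restart = 0;                          # per-thread flag
        local $SIG{USR1} = sub { $restart = 1 };  # tripped by threads->kill below

        while ( 1 ) {
            $restart = 0;
            my $iter = get_iterator();            # hypothetical: acquire a fresh iterator
            while ( defined( my $key = $iter->next ) ) {
                last if $restart;                 # writer signalled: abandon this pass
                process( $key );                  # hypothetical per-key work
            }
            return unless $restart;               # clean, complete pass
        }
    }

    my @workers = map { threads->create( \&worker ) } 1 .. 4;

    # Writer side, after any insert or delete:
    $_->kill( 'USR1' ) for threads->list( threads::running );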

Basically, to me, it comes down to this question (the answer to which I cannot know): how tolerant are the threads of the possibility that they might be served up the same key twice, and/or that they might not be served up all of the keys that existed? How tolerant can they be of the possibility that a particular use of the iterator might be drastically shortened? Or, to put it more simply: how tolerant can they be of the possibility that the iterator's outcome might be (meaningfully) wrong, weighed against the probability that such an event might occur undetected during any particular thread's use of an iterator? (And how much wall-time does any particular thread's complete use of an iterator actually take?)

Maybe you could try this approach: maintain a counter that is incremented every time an insert/delete event occurs, and have each thread capture its value before and after using an iterator. If, at the completion of an iterator loop (regardless of when or how it ended), the thread finds that the value has changed, it knows that the results are suspect: perhaps it throws away its results and tries again. Otherwise, it knows that the results it just obtained are valid, and it records them. (Variations on this theme, in which the threads check “periodically” at some agreed-upon interval to reduce lock-contention time, are obvious.)
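
A sketch of that counter, applied to an ordinary shared hash; every writer bumps $version under the lock, and process() is again a stand-in:

    use threads;
    use threads::shared;

    my %hash    :shared;
    my $version :shared = 0;   # bumped on every insert/delete

    sub writer_insert {
        my ( $key, $val ) = @_;
        lock( %hash );
        $hash{ $key } = $val;
        $version++;
    }

    sub optimistic_pass {
        while ( 1 ) {
            my $before = do { lock( %hash ); $version };

            # The walk takes only brief per-step locks, so writers
            # are never blocked for the whole traversal.
            my @keys = do { lock( %hash ); keys %hash };
            for my $key ( @keys ) {
                my $val = do { lock( %hash ); $hash{ $key } };
                process( $key, $val ) if defined $val;   # skip keys deleted mid-pass
            }

            my $after = do { lock( %hash ); $version };
            return if $before == $after;   # counter unchanged: results valid
            # counter moved: results suspect, discard and retry
        }
    }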

Here, you are specifically checking for the thing you are afraid of, optimistically hoping to find that it has not occurred, instead of pessimistically assuming that it did. “It's cheaper/easier/faster to get forgiveness than permission.” Instead of designing the iterator-pool to compensate for threads that are obliged to know nothing ... a “compensation” that just might wind up racing with all of them ... you take the approach of hoping that each thread will run the gauntlet successfully.

In a high-performance situation, I rather like this alternative ...

Re^2: Design thoughts: iterator invalidation
by BrowserUk (Patriarch) on Feb 23, 2015 at 03:46 UTC

    You sad ^&^%&*&*%^. You simply cannot help yourself, can you?

    1. 'Iteration' does not imply 'order'.

      Even an inherently ordered array can be iterated out-of-order: use List::Util 'shuffle'; print $a[ $_ ] for shuffle 0 .. $#a;

      You've obviously never even used a Perl hash. Why am I not surprised. I seriously doubt if you have ever written a single, complete, working program in Perl.

      (I'm seriously beginning to doubt if you've ever done so in any language.)

      Which just reinforces the question as to why you feel the need to continue hanging around this place; let alone sticking your confused and muddled oar in all the time.

    2. The rest is just regurgitated-out-of-context, meaningless garbage.

    Please stop posting. (In my threads at the very least!)

