You employ a class that provides a hash-like interface designed for multi-threaded operation. A part of that interface is the provision of iterators. Whilst one thread is using an iterator, other threads may be modifying the data structure. Modifications might include any of:
Which in-turn might cause a resize operation.
As opposed to implicitly because of going out of scope.
It obviously becomes necessary to invalidate existing iterators when some of the modifications occur. But is it necessary for all of them?
For example, with case A), if the only modification is the updating of a value; any existing iterator has either: a) already traversed that pair and retrieved the old value; or b) will traverse it sometime and get the new one. In other words, it will get a value which happens to be the current value at that moment in time and provided that the threads are properly decoupled (ie. not timing dependent) correct algorithms should be fine in either case.
The other cases are less clear cut. One school of thought is that if a deletion or insertion occurs behind an existing iterator; then the affected key/value pair has already been processed, so it doesn't matter; and if it happens ahead of the existing iterator they'll get processes once the iterator gets there.
The problem is that the nature of hash structures is such that an insertion or deletion either ahead or behind could affect the structure (eg. cause a rehash or resize) such that the iterator might now revisit pairs already processed and/or skip keys that haven't yet been. And what if the deletion deletes what would be the next pair to be returned. If the iterator returned undef -- what else could it do? -- then that would be confused with the normal iterator completed return.
If another thread completely empties the hash; then it could be viewed that having existing iterators simply end is the right thing to do. But of course, for some algorithms that could result in false statistics, data or conclusions.
So, the questions rattling around my brain are:
I've been thinking about this for a while and I'm beginning to suspect that there is no one right answer to it. What I'm looking for is some kind of feel for what might constitute a 'most cases' right answer.
Thoughts?
In reply to Design thoughts: iterator invalidation by BrowserUk
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |