You employ a class that provides a hash-like interface designed for multi-threaded operation. A part of that interface is the provision of iterators. Whilst one thread is using an iterator, other threads may be modifying the data structure. Modifications might include any of:

  1. Modifying existing values;
  2. Adding new key/value pairs;

    Which in-turn might cause a resize operation.

  3. Deleting existing key/value pairs;
  4. Explicitly clearing all the key/value pairs.

    As opposed to implicitly because of going out of scope.

It obviously becomes necessary to invalidate existing iterators when some of the modifications occur. But is it necessary for all of them?

For example, with case A), if the only modification is the updating of a value; any existing iterator has either: a) already traversed that pair and retrieved the old value; or b) will traverse it sometime and get the new one. In other words, it will get a value which happens to be the current value at that moment in time and provided that the threads are properly decoupled (ie. not timing dependent) correct algorithms should be fine in either case.

The other cases are less clear cut. One school of thought is that if a deletion or insertion occurs behind an existing iterator; then the affected key/value pair has already been processed, so it doesn't matter; and if it happens ahead of the existing iterator they'll get processes once the iterator gets there.

The problem is that the nature of hash structures is such that an insertion or deletion either ahead or behind could affect the structure (eg. cause a rehash or resize) such that the iterator might now revisit pairs already processed and/or skip keys that haven't yet been. And what if the deletion deletes what would be the next pair to be returned. If the iterator returned undef -- what else could it do? -- then that would be confused with the normal iterator completed return.

If another thread completely empties the hash; then it could be viewed that having existing iterators simply end is the right thing to do. But of course, for some algorithms that could result in false statistics, data or conclusions.

So, the questions rattling around my brain are:

I've been thinking about this for a while and I'm beginning to suspect that there is no one right answer to it. What I'm looking for is some kind of feel for what might constitute a 'most cases' right answer.

Thoughts?


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority". I'm with torvalds on this
In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked
any

In reply to Design thoughts: iterator invalidation by BrowserUk

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.