Critical sections only affect a single process.
Threads not attempting to enter the critical section are not blocked.
Only threads of the same process will ever be blocked, and then only if they try to enter a critical section that has already been acquired by another thread (of the same process) running concurrently on another CPU; no other thread can run on this CPU until the critical section is exited.
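To make that blocking behaviour concrete, here is a minimal, self-contained sketch (not from the original post; the counter and thread bodies are invented purely for illustration):

    #include <windows.h>
    #include <stdio.h>

    /* Two threads of the SAME process contend for the SAME critical
     * section; each only blocks while the other holds it. */
    CRITICAL_SECTION cs;
    long counter = 0;

    DWORD WINAPI worker(LPVOID unused)
    {
        int i;
        for (i = 0; i < 100000; ++i) {
            EnterCriticalSection(&cs);   /* blocks only if the other thread holds it */
            ++counter;                   /* protected region */
            LeaveCriticalSection(&cs);
        }
        return 0;
    }

    int main(void)
    {
        HANDLE t[2];
        InitializeCriticalSection(&cs);
        t[0] = CreateThread(NULL, 0, worker, NULL, 0, NULL);
        t[1] = CreateThread(NULL, 0, worker, NULL, 0, NULL);
        WaitForMultipleObjects(2, t, TRUE, INFINITE);
        printf("counter = %ld\n", counter);  /* always 200000 with the lock */
        DeleteCriticalSection(&cs);
        return 0;
    }

Without the Enter/Leave pair, the final count would almost certainly come out short on a multi-CPU machine.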
Blocking preemption of a thread sounds like it could have a noticeable effect on other processes within the system, but at worst (ie. if the thread entered the critical section just before it was about to be preempted) it would only extend that thread's timeslice by a few hundred or so clock cycles.
On average, the critical section would be entered at the beginning or in the middle of the timeslice, and possibly several times, rather than at the end; and the minuscule extension to the timeslice, when it did occur at the end, would be negligible.
Swapping to use a PAGE_GUARD/STATUS_GUARD_PAGE exception, rather than PAGE_READONLY/ACCESS_VIOLATION, would allow both reads and writes to the shared variables to trigger it, and would also remove one step from the 8 I listed, as the PAGE_GUARD attribute is cleared automatically.
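A minimal sketch of arming the guard (the function and parameter names are assumed here, not from the original post):

    #include <windows.h>

    /* Arm the page holding a shared variable so that the first read OR
     * write from any thread raises a guard-page exception.  PAGE_GUARD is
     * combined with the normal access rights; Windows clears the guard
     * attribute automatically the first time the page is touched. */
    void arm_guard_page(void *shared_var)
    {
        DWORD oldProtect;
        VirtualProtect(shared_var, 1, PAGE_READWRITE | PAGE_GUARD, &oldProtect);
    }

Because the attribute self-clears, it has to be re-applied once the protected access completes; the catch-side sketch further down shows that.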
The structured exception mechanism works on a per-thread basis. Naively, it is similar to throw/catch.
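A minimal illustration of that throw/catch flavour (not from the post): the __except filter decides whether this frame handles the exception, and the handler search is per thread, innermost frame first.

    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        __try {
            RaiseException(0xE0000001, 0, 0, NULL);   /* the "throw" */
        }
        __except (EXCEPTION_EXECUTE_HANDLER) {        /* the "catch" */
            printf("caught exception 0x%08lX\n", GetExceptionCode());
        }
        return 0;
    }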
I see no reason why any privilege changes would occur.
It would work on multi-core/processor systems (if it worked at all!).
Now the $a = $a + 1; scenario (and all the others like it). The opcodes (perl5) representing that operation look like:
    BINOP (0x191da14) sassign
        BINOP (0x191da38) add [3]
            UNOP (0x191da7c) null [15]
                PADOP (0x191da9c) gvsv  GV (0x1824348) *a
            SVOP (0x191da5c) const [4]  IV (0x22509c) 1
        UNOP (0x191dabc) null [15]
            PADOP (0x191dadc) gvsv  GV (0x1824348) *a
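For reference, assuming a stock perl5 with the B compiler backends available, a dump of this shape can be regenerated with:

    perl -MO=Terse -e '$a = $a + 1'

(the full output also includes the enclosing nextstate/leave ops, omitted above).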
If the critical section (they cost almost nothing to enter (no transfer to kernel mode) if they are unentered by other threads of the same process) is entered at the appropriate point in the opcode tree, in the above case the sassign, then all the descendant opcodes are also completed within the auspices of the critical section, and so both gvsvs on *a are atomic.
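A hedged sketch of that scoping, with invented helper names (fetch_gvsv_a/store_gvsv_a stand in for the two gvsv ops on *a): the lock is taken once, at the sassign, so the fetch, the add and the store all complete atomically with respect to the other threads of this process.

    #include <windows.h>

    extern long fetch_gvsv_a(void);        /* hypothetical helpers standing  */
    extern void store_gvsv_a(long value);  /* in for the two gvsv ops on *a  */

    static CRITICAL_SECTION shared_cs;     /* initialised once at startup    */

    static void sassign_a_plus_1(void)
    {
        long right, sum;
        EnterCriticalSection(&shared_cs);  /* near-free, user mode only,
                                              when no other thread holds it  */
        right = fetch_gvsv_a();            /* gvsv *a                        */
        sum   = right + 1;                 /* add [3] with const 1           */
        store_gvsv_a(sum);                 /* sassign back into *a           */
        LeaveCriticalSection(&shared_cs);
    }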
The exception handling mechanism (~throw) would be wrapped around each (mutating) opcode, and the handler (catch) would be specific to that opcode. Normal operations on non-shared variables never throw the exception, so they cost nothing extra.
The critical section would be entered through the catch code, only when the exception occurs.
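A hedged sketch of that catch-side entry (the wrapper and parameter names are invented). The filter claims only guard-page exceptions and lets everything else keep unwinding; EXCEPTION_GUARD_PAGE is the <windows.h> name for STATUS_GUARD_PAGE_VIOLATION.

    #include <windows.h>

    static CRITICAL_SECTION shared_cs;   /* initialised once at startup */

    static void run_mutating_op(void (*op_body)(void), void *shared_page)
    {
        __try {
            op_body();                   /* non-shared data: no exception,
                                            so no extra cost at all      */
        }
        __except (GetExceptionCode() == EXCEPTION_GUARD_PAGE
                      ? EXCEPTION_EXECUTE_HANDLER
                      : EXCEPTION_CONTINUE_SEARCH) {
            DWORD old;
            /* Windows has already cleared PAGE_GUARD, so redoing the op
             * here cannot re-fire the exception.  (Simplified: assumes the
             * op body can safely be re-run after the aborted first try.)  */
            EnterCriticalSection(&shared_cs);
            op_body();                   /* redo the opcode under the lock */
            VirtualProtect(shared_page, 1,
                           PAGE_READWRITE | PAGE_GUARD, &old);   /* re-arm */
            LeaveCriticalSection(&shared_cs);
        }
    }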
This implies that effectively all accesses to all shared data are serialised across all threads within the process. However, since each critical section will always be entered and exited within a single (possibly marginally extended) timeslice, the longest any thread would have to wait to enter a critical section is 1 timeslice * the number of threads in the process. This longest delay(*) would only occur if all threads tried to perform a mutating opcode on shared data, each on a separate processor, concurrently. The average case would be much, much less.
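To put assumed numbers on that bound (purely for illustration, not from any particular OS): with 8 threads and a 15ms timeslice, the worst case wait is 8 * 15ms = 120ms, and with typical levels of contention the actual wait would be a tiny fraction of that.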
Does it go anywhere?
(*) The longest wait calculation is not strictly correct. Thread scheduling is non-deterministic, therefore there is no guarantee that a thread blocked waiting for entry to a critical section would receive that entry before another thread that had previously also been waiting and, having already received a timeslice, managed to re-enter the wait state and again be selected.
But this is true of all wait states and in practice does not cause a thread to be starved of CPU over the long term (1 or 2 seconds). Besides which, many if not all OSs have mechanisms that temporarily, for one timeslice, boost the priority of a thread that isn't getting CPU, causing it to get preferential selection on the next round robin.
Also, other processes' threads are more likely to be selected than another thread from this process; this is the norm. The time spent in the threads of other processes (on other CPUs, as this one is prevented from being preempted while it is in a critical section) reduces the likelihood that another thread from this process will be able to run concurrently on that other CPU, and so it serves to reduce the likelihood that another thread will be blocked from running by this thread holding the critical section.
This stuff is all pretty well tuned, so ignoring the effects of the non-determinism in such calculations is the norm. Over time, the randomness cancels out and the average situation prevails.