Can anyone make sense of the following extract from a paper (pdf):
Compare and swap (casx): This instruction swaps the contents of one memory position allocated in the L2 data cache with the value of a register. This means that this instruction always accesses a memory location in L2 cache.
How (on Solaris, possibly only in assembler?) do you allocate a piece of memory such that "This means that this instruction always accesses a memory location in L2 cache."?
That is:
There is no further explanation of this in the paper. They do however mention a pointer chasing arrangement to produce consistent L2 cache misses, which makes sense and suggests they know what they are talking about.
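For anyone unfamiliar with the pointer chasing idea, here is a rough sketch in C of how I understand it; this is my own illustration, not code from the paper. The names `build_chain` and `chase` are made up. The chain is a single random cycle (built with Sattolo's algorithm), so every load depends on the previous one and the hardware prefetcher gets little to work with. To actually force L2 misses, I assume the array would have to be larger than the L2 cache and each element padded to its own cache line, neither of which this sketch does.

```c
#include <stddef.h>
#include <stdlib.h>

/* Fill next[0..n-1] with one random cycle (Sattolo's algorithm), so that
   following next[] from any start visits every slot exactly once before
   returning to the start. */
void build_chain(size_t *next, size_t n)
{
    for (size_t i = 0; i < n; i++)
        next[i] = i;
    if (n < 2)
        return;
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)rand() % i;   /* 0 <= j < i, never i itself */
        size_t t = next[i];
        next[i] = next[j];
        next[j] = t;
    }
}

/* Follow the chain for 'steps' hops. Each load address comes from the
   previous load, so the hops cannot overlap. */
size_t chase(const size_t *next, size_t start, size_t steps)
{
    size_t p = start;
    while (steps--)
        p = next[p];
    return p;
}
```

Timing a long `chase` over a big enough chain and dividing by the number of steps should give a rough per-load latency.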
Their very purpose in using casx is to benefit from the low impact and high latency of an L1 cache miss. They further go on to say:
Its latency is 39 cycles in T1 and between 20 and 30 cycles in T2 (in our experiments it takes almost always about 30 cycles). This instruction does not excessively stress the processor structures that could be used by the active thread. In fact, casx only uses one entry of the shared LSU structure that connects the core to the interconnection network. Moreover, the memory space requirements of using this instruction are very low since all the spin-locks can access the same memory position.
Which makes it unlikely that the above is a slip of their tongues or otherwise a misinterpretation of their meaning.
I'm trying to work out how to apply their work on an Intel processor. The Perl link is another attempt at trying to make efficient shared memory available from Perl.
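To make the question concrete, this is roughly the shape of the thing I would be porting, sketched with C11 atomics. The names `shared_word`, `cas64` and `cas_delay` are my own inventions, not anything from the paper. On x86-64, GCC and Clang typically compile the compare-exchange to `lock cmpxchg`. As far as I know, a locked operation on Intel is normally satisfied in L1 when the core already owns the line, so this only reproduces the form of the casx trick, not the "always goes to L2" property I am asking about.

```c
#include <stdatomic.h>
#include <stdint.h>

/* One shared word that every waiter hits, mirroring the paper's remark that
   all spin-locks can access the same memory position. (Hypothetical name.) */
_Atomic uint64_t shared_word = 0;

/* casx-like primitive: if *addr == expected, store desired.
   Returns the value that was observed in memory, whether or not the
   swap happened. */
uint64_t cas64(_Atomic uint64_t *addr, uint64_t expected, uint64_t desired)
{
    /* On failure, 'expected' is overwritten with the observed value;
       on success it already equals the old value. */
    atomic_compare_exchange_strong(addr, &expected, desired);
    return expected;
}

/* Burn time by issuing n compare-and-swaps that never change the word:
   comparing against 0 and storing 0 leaves any value untouched. */
void cas_delay(unsigned n)
{
    for (unsigned i = 0; i < n; i++)
        (void)cas64(&shared_word, 0, 0);
}
```

A spin-wait loop would then call something like `cas_delay(1)` between checks of the lock, in place of a plain busy loop.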
In reply to OT: Solaris expertise? by BrowserUk