
Usage scenario. Most of the time, the producer will be running on one core and the consumer on another, and they will be producing and consuming from their respective ends of the shared memory structure as fast as they can go. No locking; no synching; no (elective) context switching.

Occasionally, one end or the other will get preempted for some higher priority thread. At this point, the shared data structure will become either full or empty, depending upon which end is still running. The running end then needs to enter a wait state until the other end gets another timeslice, does its thing, relieves the empty or full state, and wakes the waiting end up to continue.

Most of the time, given a correctly sized and well-written buffering data structure, the above scenario is lock-free, wait-free and requires no system calls (ring3/ring0/ring3 transitions). Both consumer and producer threads are free to run as fast as their processing requirements allow them and utilise their full timeslices. The latter point is the key to maximum utilisation.
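To make the scenario concrete, here is a minimal sketch of the kind of structure I mean: a single-producer/single-consumer ring buffer built on C11 atomics. The names (spsc_ring, ring_try_push, ring_try_pop), the int payload and the capacity are all my own assumptions for illustration, not taken from any particular library. The fast path is a couple of atomic loads and one atomic store; nothing in it calls into the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_CAPACITY 1024u   /* hypothetical size; must be a power of two */

static_assert((RING_CAPACITY & (RING_CAPACITY - 1)) == 0,
              "RING_CAPACITY must be a power of two");

typedef struct {
    atomic_size_t head;        /* next slot to read; written only by the consumer */
    atomic_size_t tail;        /* next slot to write; written only by the producer */
    int slots[RING_CAPACITY];  /* payload type is an assumption for the sketch */
} spsc_ring;

static void ring_init(spsc_ring *r)
{
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);
}

/* Producer side only. Returns false (without blocking) if the ring is full. */
static bool ring_try_push(spsc_ring *r, int value)
{
    size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (tail - head == RING_CAPACITY)
        return false;                              /* full */
    r->slots[tail & (RING_CAPACITY - 1)] = value;
    /* Release: the slot write above becomes visible before the new tail. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}

/* Consumer side only. Returns false (without blocking) if the ring is empty. */
static bool ring_try_pop(spsc_ring *r, int *out)
{
    size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head == tail)
        return false;                              /* empty */
    *out = r->slots[head & (RING_CAPACITY - 1)];
    /* Release: the slot read above completes before the slot is handed back. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}
```

This only covers the fast path. What to do when a try call returns false (spin, yield, or sleep until the other end catches up) is exactly the open question.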

If I use suspend/resume, buffer empty/full conditions are guaranteed to require not only multiple calls into the kernel, but also (at least one) very expensive context switch. If I use cond_vars and (unneeded) kernel mutexes, that also means an expensive call into the kernel for every read & write.

The whole point of lock-free & wait-free algorithms is that they avoid both expensive calls into the kernel and expensive elective context switches--i.e. non-pre-emptive ceding of the cpu--in order to make full use of each time-slice allotted.

The point of fast user-space mutexes (futexes) is that the uncontended case runs entirely in user-space, and is therefore faster.
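As a rough illustration of that idea applied to the empty/full case, here is a Linux-only sketch that enters the kernel only when one end actually has to sleep or be woken. The wait_point type and its functions are hypothetical names of mine; the only real API used is the futex system call with FUTEX_WAIT and FUTEX_WAKE. I am assuming one wait_point per direction (one for "not empty", one for "not full").

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper type for this sketch. */
typedef struct {
    _Atomic uint32_t seq;      /* futex word; bumped on every slow-path wake */
    _Atomic uint32_t waiters;  /* count of threads that may be about to sleep */
} wait_point;

static_assert(sizeof(_Atomic uint32_t) == sizeof(uint32_t),
              "futex word must be a plain 32-bit integer");

static long futex_call(_Atomic uint32_t *addr, int op, uint32_t val)
{
    return syscall(SYS_futex, (uint32_t *)addr, op, val, NULL, NULL, 0);
}

static void wait_point_init(wait_point *w)
{
    atomic_init(&w->seq, 0);
    atomic_init(&w->waiters, 0);
}

/* Step 1 for the blocked end: announce intent to sleep and take a ticket.
 * The caller must then re-check the empty/full condition before step 2. */
static uint32_t wait_point_prepare(wait_point *w)
{
    atomic_fetch_add(&w->waiters, 1);
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load(&w->seq);
}

/* Step 2a: the condition cleared during the re-check, so do not sleep. */
static void wait_point_cancel(wait_point *w)
{
    atomic_fetch_sub(&w->waiters, 1);
}

/* Step 2b: sleep in the kernel, but only if seq still equals the ticket.
 * May return spuriously; the caller loops and re-checks its condition. */
static void wait_point_commit(wait_point *w, uint32_t ticket)
{
    futex_call(&w->seq, FUTEX_WAIT, ticket);
    atomic_fetch_sub(&w->waiters, 1);
}

/* Called by the running end after each push or pop.
 * Returns true only if it took the slow path and made a system call. */
static bool wait_point_notify(wait_point *w)
{
    atomic_thread_fence(memory_order_seq_cst);
    if (atomic_load(&w->waiters) == 0)
        return false;                     /* fast path: stays in user-space */
    atomic_fetch_add(&w->seq, 1);
    futex_call(&w->seq, FUTEX_WAKE, 1);
    return true;
}
```

The intended use is: on finding the buffer empty (or full), call wait_point_prepare, re-check the condition, then either cancel or commit. The cost on the fast path is one full memory fence per notify, which is not free, but it is not a ring3/ring0/ring3 transition either. Whether that trade-off is the best available is something I would want to measure rather than assert.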

The (lock-free/wait-free) algorithms are getting better and better defined. The hardware support (CAS, XCHG and similar SMP atomic instructions) is getting better and better with every new generation of processors.
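For anyone who has not used these primitives directly, here is a small sketch of both using C11 atomics, which compilers typically lower to the CAS and XCHG style instructions mentioned above. The function names are hypothetical, and the static_assert encodes my assumption that int-sized atomics are always lock-free on the target.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Assumption for this sketch: atomic_int never falls back to a lock. */
static_assert(ATOMIC_INT_LOCK_FREE == 2,
              "atomic_int is expected to be always lock-free here");

/* CAS retry loop: atomically raise *target to at least `candidate`. */
static void atomic_store_max(atomic_int *target, int candidate)
{
    int seen = atomic_load(target);
    while (seen < candidate &&
           !atomic_compare_exchange_weak(target, &seen, candidate)) {
        /* On failure `seen` now holds the current value; loop and retry. */
    }
}

/* XCHG: unconditionally swap in `true` and report whether we were first. */
static bool try_claim(atomic_bool *claimed)
{
    return !atomic_exchange(claimed, true);
}
```

Neither function ever blocks or calls into the kernel; a thread that loses the CAS race simply retries with the fresher value.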

The current limitations are the locking, syncing and signalling mechanisms, which were designed for single-processor/core IPC purposes. Given that much of the HPC research is done on *nix boxes of one flavour or another, I know there are better mechanisms out there. This thread was meant to be about enlisting help to find them, not to argue about whether they are possible, or even required.

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.

"Science is about questioning the status quo. Questioning authority".

In the absence of evidence, opinion is indistinguishable from prejudice.

In reply to Re^4: OT: Locking and syncing mechanisms available on *nix. by BrowserUk
in thread OT: Locking and syncing mechanisms available on *nix. by BrowserUk
