A very good point ... (upvoted ... [P.S. believe it or not I don’t downvote you]). But also, please note: another example of “intentional over-allocation,” this time with “processor cores” as the resource. It is the very reasonable gamble that most CPUs are nowhere near “100% busy.”
The hypervisors in your hardware-rich environment can “move the extra work to other servers.” Other production environments might be just as hardware-generous as yours, or there may well be no other server to move to. (The farm in question is not a “truly symmetrical multiprocessing architecture.”) The designer had better know which is the case. It is safe to assume that no one would get “the benefit of those overcommitted cores” if the entities to which those resources had been (over-)committed suddenly tried to use them as though they were real. A highly CPU-intensive task, e.g. a 3D render or some such, that tried to spread itself across the “32 cores” that it thought it had would ... as you well put it ... “cost dear.”
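One way to hedge against that gamble in practice is to size a worker pool from what the host is *actually* doing, not from the nominal core count. A minimal sketch in Python (the function name and `safety_margin` are illustrative choices, not any standard API):

```python
import os

def usable_workers(safety_margin: float = 0.75) -> int:
    """Estimate how many CPU-bound workers are prudent to spawn.

    Rather than trusting the nominal core count (which a hypervisor
    may have overcommitted), subtract the load the host is already
    carrying.  `safety_margin` is an arbitrary illustrative knob.
    """
    # Cores this process may actually be scheduled on (honors
    # affinity masks on Linux); fall back to the nominal count.
    try:
        cores = len(os.sched_getaffinity(0))
    except AttributeError:          # e.g. macOS, Windows
        cores = os.cpu_count() or 1
    # One-minute load average: tasks already competing for the CPU.
    try:
        load_1min = os.getloadavg()[0]
    except OSError:                 # platform without getloadavg()
        load_1min = 0.0
    headroom = cores * safety_margin - load_1min
    return max(1, int(headroom))
```

On a box that is already near saturation this returns 1, i.e. “don’t fan out at all,” which is exactly the prudence the 32-way render lacked.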
We take plenty of advantage of the fact, usually true, that the CPU is so vastly faster than any of the mechanical devices attached to it that we can virtualize that resource at both the CPU and the hypervisor level. The same is generally true of RAM: most of the time the virtual-memory size is quite large, but the working-set size remains quite small, with only occasional peaks.
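That gap between “virtual-memory size” and “working set” is directly visible on Linux. A small, Linux-specific sketch that compares the two for the current process (elsewhere it simply returns `None`):

```python
def vm_vs_working_set(status_path: str = "/proc/self/status"):
    """Return (virtual_kb, resident_kb) for this process, or None.

    Linux-specific: VmSize is the address space the process *thinks*
    it has; VmRSS is the RAM it is actually occupying right now.  On
    a healthily overcommitted host the first can dwarf the second.
    """
    sizes = {}
    try:
        with open(status_path) as f:
            for line in f:
                if line.startswith(("VmSize:", "VmRSS:")):
                    key, value = line.split(":")
                    sizes[key.strip()] = int(value.split()[0])  # kB
    except OSError:
        return None                 # not Linux, or /proc unavailable
    if "VmSize" in sizes and "VmRSS" in sizes:
        return sizes["VmSize"], sizes["VmRSS"]
    return None
```

So long as resident stays well below virtual, the overcommit gamble is paying off; when the two converge across every tenant at once, the gamble is lost.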
If you tried to do a 32-way 3D render on that box, especially without lots of other blades to fail over onto, it would certainly be “dear.” And exactly the same thing holds true for RAM: either you have it, or you don’t. If you don’t really have it ... “dear, dear.”
The usual performance-graph illustration of “thrashing,” of any sort, is a line that climbs gently and then suddenly turns nearly straight up ... “hitting the wall.™”
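That wall shape falls straight out of the textbook M/M/1 queueing approximation, R = S / (1 − ρ): response time stays flat at low utilization and explodes as ρ approaches 1. A toy illustration (this is the standard formula, not a measurement of any particular system):

```python
def response_time(service_time: float, utilization: float) -> float:
    """Textbook M/M/1 mean response time: R = S / (1 - rho).

    As rho approaches 1 the denominator vanishes and R shoots toward
    infinity -- the "hitting the wall" shape on every thrashing graph.
    """
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return service_time / (1.0 - utilization)

# Flat, flat, flat ... wall:
curve = [round(response_time(1.0, u), 1) for u in (0.5, 0.9, 0.99, 0.999)]
# → [2.0, 10.0, 100.0, 1000.0]
```

Note that the last 0.9% of utilization costs ten times more than everything before it.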
The real bugaboo about memory, even more than VMs, is that when paging starts in earnest, the paging itself is an indirect consequence of whatever the program is doing. You have only a “probability” that any given memory access will result in a page fault. You can tilt that probability in your favor only by building algorithms that exhibit locality of reference ... algorithms that do not treat “all of the memory that I think I have” as something they really do have and can therefore expect to access without any time penalty.

VMs differ in that the resource being split there is CPUs and their cores, which have entirely different operational characteristics, but the principle is the same. In the end, you have only so many physical resources at your disposal, all of them virtualized in an effort to make more productive use of them than any one purpose could on its own. All is well until and unless you truly exceed them. Then, “dear disaster.” Unless you have another server to spirit the load onto (lucky you ...), there’s nowhere to go but back to the drawing board.
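The locality-of-reference point can be made concrete with a crude paper model: treat memory as fixed-size pages and count how often successive accesses land on a *different* page, each change being a chance to fault. (The page size and the model itself are illustrative, not measurements of a real MMU.)

```python
def page_transitions(access_pattern, page_size=1024):
    """Count how often successive accesses change pages.

    A crude model of page-fault pressure: a sequential scan changes
    pages rarely, while a large-stride scan changes pages on nearly
    every access -- the access pattern, not the data size, drives
    the fault probability.
    """
    transitions = 0
    last_page = None
    for addr in access_pattern:
        page = addr // page_size
        if page != last_page:
            transitions += 1
        last_page = page
    return transitions

n = 64 * 1024
sequential = range(n)                   # good locality: 64 transitions
strided = range(0, n * 1024, 1024)      # one fresh page per access: 65536
```

Same number of accesses in both patterns; the strided one touches a thousand times as many page boundaries. That ratio is the whole argument for locality of reference in one line.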
In reply to Re^5: RAM: It isn't free . . . (over-commit allocation -v- actual use)