Always remember that an operating system is, by design, extremely “lazy.” In other words:
- When your process makes a memory-allocation request, the operating system will first make sure that it can carry out the request without grossly overcommitting the memory resource. (A certain amount of overcommit is fine.)
- Only when your process actually does something to create real pressure upon a resource will the operating system begin to take action. (If, in a different scenario, you asked for 100MB but only filled 2MB with anything other than zeroes, the actual pressure that you exerted was 2MB, not 100MB.)
- (Your process did exert 100MB of actual pressure, one 4K-page at a time, because it did fill all those bytes with something...)
- When your process releases a resource, once again the resource will be “marked as releasable,” but it will actually remain nearby unless and until actual pressure from somewhere else compels the operating system to begin reallocation.
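To make the "100MB requested, 2MB touched" scenario concrete, here is a minimal Python sketch. It assumes a typical demand-paged system, and the names `touch_pages` and `demo` are made up for illustration. It asks for a 100MB anonymous mapping and then writes one non-zero byte into each page of only the first 2MB. The code counts the pages it touches itself; it does not measure resident memory, which you would need an OS-specific tool to observe.

```python
import mmap

REQUESTED = 100 * 1024 * 1024   # size we ask the OS for
TOUCHED = 2 * 1024 * 1024       # portion we actually write to


def touch_pages(buf, nbytes, page_size=mmap.PAGESIZE):
    """Write one non-zero byte into each page covering the first nbytes.

    Returns the number of bytes covered by the pages that were touched.
    """
    pages = 0
    for offset in range(0, nbytes, page_size):
        buf[offset] = 1          # first write to a page is what forces the OS to back it
        pages += 1
    return pages * page_size


def demo():
    # Anonymous mapping: the request succeeds, but on a lazy OS the
    # untouched pages are not expected to cost real memory yet.
    buf = mmap.mmap(-1, REQUESTED)
    try:
        pressure = touch_pages(buf, TOUCHED)
        return len(buf), pressure
    finally:
        buf.close()


if __name__ == "__main__":
    requested, pressure = demo()
    print(f"requested {requested} bytes, touched {pressure} bytes")
```

If you watch the process with something like `top` while it holds the mapping, you would expect the virtual size to grow by roughly the full request and the resident size by roughly the touched portion, though the exact numbers depend on the operating system and its settings.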
You see, the odds are good that a process's future behavior will be similar to its recent behavior. Processes that are making use of big buffers (and that are not brilliantly written by their designers...) are likely to be grabbing and releasing those big buffers in a loop. Programs that have been run recently are much more likely to be run again soon, than are any that were not. Files that were used recently are the same way. So, there are plenty of extremely good reasons for the operating system to say, “if it's not actually squeaking, or if the squeaking doesn't matter to anyone else right now, don't bother to grease it.”