Thank you for the clarification of terminology and the description of benchmarking practice. I'm interested in how variance from OS competition for resources affects both profiling and benchmarking.
The number of times function X is called is, of course, stable from profile run to profile run, but at least in my experience the ranking of functions by time can vary greatly from run to run. For example, in one run a function that was called ~5000 times clocked 0.016s, ranked 3rd, and consumed 10.8% of the time. In another run using the same data, that same function clocked 0.003s, ranked 4th, and consumed 4.88% of the time. A function that consumes 11% of the time is a potential bottleneck; at 5% of the time, I'm not so sure.
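To make the effect concrete, here is a minimal sketch of the kind of repeated profiling I mean, using Python's cProfile. The `work()` function is a hypothetical stand-in for "function X" above; the call count reported for it is identical across runs, while the measured time is free to wander.

```python
import cProfile
import io
import math
import pstats

def work(n):
    # Hypothetical workload standing in for "function X" in the post.
    return sum(math.sqrt(i) for i in range(n))

def profile_once():
    """Profile one run; return (call count, self time) recorded for work()."""
    pr = cProfile.Profile()
    pr.enable()
    for _ in range(5000):   # ~5000 calls, as in the example above
        work(100)
    pr.disable()
    stats = pstats.Stats(pr, stream=io.StringIO())
    # stats.stats maps (file, line, name) -> (cc, nc, tt, ct, callers)
    for func, (cc, nc, tt, ct, callers) in stats.stats.items():
        if func[2] == "work":
            return nc, tt
    return None

# Repeat the profile on identical input: the call count is stable,
# but the measured self time varies from run to run.
for i in range(3):
    calls, t = profile_once()
    print(f"run {i}: work() called {calls} times, self time {t:.4f}s")
```

Printing the self times side by side across runs, rather than trusting a single run's percentage column, is one way to see whether an apparent bottleneck is real or just scheduler noise.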
Best, beth