PAPI (the Performance Application Programming Interface, https://en.wikipedia.org/wiki/Performance_Application_Programming_Interface) might help here, since it shows the number of cycles spent rather than time per se. It has been around since 2000 and is still maintained (most recent release in 2020, see https://bitbucket.org/icl/papi/wiki/PAPI-Releases).

To instrument the code, use the functions mentioned in https://stackoverflow.com/questions/49045742/how-to-use-oprofile-to-calculate-execution-time-of-a-part-of-c-program.
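As a rough illustration of what such instrumentation could look like on the C side, here is a minimal sketch. It assumes PAPI's low-level calls PAPI_library_init() and PAPI_get_real_cyc(); the names work(), ticks_now(), measure() and the USE_PAPI switch are made up for this example and are not part of PAPI or of any XS module. Without -DUSE_PAPI it falls back to clock_gettime(), so the numbers are nanoseconds instead of cycles.

```c
/* Without PAPI:  cc -O2 -c sketch.c
 * With PAPI:     cc -O2 -DUSE_PAPI -c sketch.c   (link with -lpapi) */
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <time.h>
#ifdef USE_PAPI
#include <papi.h>
#endif

/* Hypothetical stand-in for the C body of an XS routine:
 * sums the squares 1*1 + 2*2 + ... + n*n. */
unsigned long work(unsigned long n)
{
    unsigned long sum = 0;
    for (unsigned long i = 1; i <= n; i++)
        sum += i * i;
    return sum;
}

/* Current tick count: real CPU cycles when built with USE_PAPI,
 * monotonic nanoseconds otherwise. Returns -1 if PAPI fails to
 * initialise. */
long long ticks_now(void)
{
#ifdef USE_PAPI
    static int ready = 0;
    if (!ready) {
        if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT)
            return -1;
        ready = 1;
    }
    return PAPI_get_real_cyc();
#else
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
#endif
}

/* Runs fn(n), stores its return value in *result, and returns the
 * number of ticks that elapsed around the call. */
long long measure(unsigned long (*fn)(unsigned long),
                  unsigned long n, unsigned long *result)
{
    assert(fn != NULL && result != NULL);
    long long t0 = ticks_now();
    *result = fn(n);
    long long t1 = ticks_now();
    return t1 - t0;
}
```

In an XS routine you would presumably call ticks_now() before and after the section of interest and report the difference, or wrap the call as measure(work, 1000, &r) does here. A single measurement is noisy, so repeat it a number of times and compare the minimum or the median.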