You're right. I am timing entire programs.
But I am timing one program running enough iterations
of the match to take 10 seconds' worth of CPU time,
up to a max of 10,000 iterations.
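If you assume Perl starts in less than 10ms (it loads in well under that on my machine), then at the 10,000-iteration cap the extra time added by startup is no more than 1 microsecond per iteration in the final numbers (10ms / 10,000 iterations). For runs that stop at the 10-second CPU limit instead, startup is under 0.1% of the total.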
Also, the point was really the difference in growth rates: the 60+ seconds for backtracking isn't being spent during program load. (And PCRE is a C program too.)
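
To make the startup arithmetic above concrete, here is a minimal Python sketch of that kind of measurement loop. It is not the actual harness behind the numbers; the names `time_per_match` and `startup_overhead_per_iter`, the test regex, and the tiny CPU budget in the usage lines are all assumptions chosen for illustration.

```python
import re
import time

def time_per_match(match_once, cpu_budget_s=10.0, max_iters=10_000):
    """Call match_once repeatedly until cpu_budget_s of CPU time has been
    used or max_iters calls have been made, whichever comes first.
    Returns (CPU seconds per iteration, iterations run)."""
    start = time.process_time()
    iters = 0
    while iters < max_iters:
        match_once()
        iters += 1
        if time.process_time() - start >= cpu_budget_s:
            break
    elapsed = time.process_time() - start
    return elapsed / iters, iters

def startup_overhead_per_iter(startup_s, iters):
    """Startup cost amortized over every iteration of a single run."""
    return startup_s / iters

# Hypothetical usage: a small pattern and a short budget so it finishes quickly.
pattern = re.compile(r"a?a?a?aaa")
per_iter, iters = time_per_match(lambda: pattern.match("aaa"),
                                 cpu_budget_s=0.05, max_iters=10_000)

# A 10 ms startup spread over 10,000 iterations is about 1e-06 s (1 microsecond).
overhead = startup_overhead_per_iter(0.010, 10_000)
```

Because startup happens once per run rather than once per match, its share of the per-iteration figure shrinks as the iteration count grows, and it is bounded by the ratio of startup time to total CPU time when the run stops on the time budget.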