We use a lot of real-time Perl scripting in an environment with high IOPS. The accumulated module-load lag becomes noticeable.
How much time do the stat syscalls actually take compared to the time needed for reading and compiling a single module file? I would answer that question first. To me, without proof in hand, this looks like micro-optimization in the wrong place. Making the code resident in memory to avoid startup overhead, and/or using persistent processes, is IMHO the better optimization. Of course, it depends on what you are doing.
Another way to reduce the overhead is e.g. App::FatPacker: bundle everything into a single file, so there is only one file to stat, and by the way you also avoid all those per-module open/close syscalls.
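For completeness, the fatpack workflow is roughly this (the script name is a placeholder; `fatpack pack` needs a reasonably recent App::FatPacker, and it only bundles pure-Perl dependencies, not XS modules):

```shell
# Install the packer (assumes cpanm is available)
cpanm App::FatPacker

# Bundle myscript.pl plus the pure-Perl modules it loads into one file
fatpack pack myscript.pl > myscript.packed.pl

# One file to stat and open, instead of a whole @INC search per module
perl myscript.packed.pl
```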
perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'