In addition to turning off threads and debugging, as already mentioned, it may be worth trying a build that uses perl's own memory allocator.
This immediately breaks binary compatibility, but you can gain dramatic speed improvements on some OSes (PocketPC is one, and Win32 is another; I have not tried comparing this elsewhere).
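To see which allocator a given perl was built with, something like the following should work. It is only a minimal sketch: it reads the `usemymalloc` entry from the standard `Config` module, and the variable names are mine. On Unix-like builds the allocator is normally selected at Configure time with `-Dusemymalloc`; other platforms have their own build settings, so check the build files for your port.

```perl
use strict;
use warnings;
use Config;    # build-time configuration of the running perl

# usemymalloc is 'y' when perl was built with its own malloc, 'n' otherwise.
# Treat a missing entry as 'n' (an assumption made for this sketch).
my $flag = defined $Config{usemymalloc} ? $Config{usemymalloc} : 'n';
my $own_malloc = $flag eq 'y' ? 1 : 0;

print "perl $Config{version} uses ",
      ( $own_malloc ? "its own" : "the system" ), " malloc\n";
```

Running this under both builds before benchmarking confirms that you are really comparing the two allocators.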
BR vkon