I'm a bit surprised that none of the answers so far have mentioned or asked about Devel::DProf. So: have you tried running your script under the profiler?
In my experience, it can come as a surprise to see where most of the time is actually spent.
The bottleneck is most likely I/O. If it were up to me, I would start with a simple test using the time utility to see how much of the run time goes to executing the program versus waiting on I/O, and only then decide whether it's worth profiling the code.
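As a rough sketch of that first check, something like the following works in bash. It relies on bash's `time` keyword and `TIMEFORMAT` variable, so it is not portable to plain sh. I don't know what your script is called, so `sleep 1` stands in for it here as a command that waits a lot and computes very little.

```shell
#!/usr/bin/env bash
# Sketch: compare wall-clock time with CPU time for one command.
# "sleep 1" is a stand-in for an I/O-bound program: it waits but uses almost no CPU.
# In practice, replace it with your own command, e.g. perl your_script.pl
# (your_script.pl is a hypothetical name).

# bash-specific: make "time" print real, user and sys seconds on one line.
TIMEFORMAT='%R %U %S'

# The "time" keyword reports on stderr, so capture that stream.
timing=$( { time sleep 1 >/dev/null; } 2>&1 )
read -r real user sys <<<"$timing"

echo "real=${real}s user=${user}s sys=${sys}s"
```

If `real` is much larger than `user` plus `sys`, the program spent most of its time waiting (often on I/O), and profiling the code is unlikely to buy much. If `user` dominates, the time is going into the code itself and the profiler is the next step.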