It would be best to do some benchmarking and profiling of the code before trying to optimize it. Benchmarking means measuring execution time as a function of file size, to see whether run time grows linearly with the input or jumps at some threshold. Profiling means timing specific blocks of code separately and ranking them by how much of the total execution time is tied up in each block; the blocks that take the most time are the ones to focus on when optimizing. (Check out Devel::DProf, or its more modern successor Devel::NYTProf.)
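As a rough sketch of the benchmarking side (the `process_file` sub and the file names here are placeholders for whatever your script actually does), the core Benchmark module handles the timing:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(timethis);

# Placeholder for whatever your script does to a single file.
sub process_file {
    my ($path) = @_;
    open my $fh, '<', $path or die "Can't open $path: $!";
    while (my $line = <$fh>) {
        # ... the real per-line work goes here ...
    }
    close $fh;
}

# Run the same code against files of increasing size; if the reported
# times grow much faster than the file sizes do, something non-linear
# (slurping, repeated passes, etc.) is probably going on.
for my $file (qw(small.txt medium.txt large.txt)) {
    printf "%s (%d bytes):\n", $file, -s $file;
    timethis( 10, sub { process_file($file) } );
}
```

For the profiling side, run the script as `perl -d:DProf yourscript.pl` and then feed the resulting tmon.out to `dprofpp` to get a ranked list of where the time goes (with Devel::NYTProf it's `perl -d:NYTProf yourscript.pl` followed by `nytprofhtml`).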
Apart from that, some general issues to look at might include:
- If you're holding the full content of a given file in memory, consider reading it line by line instead; slurping a whole file makes the process's memory footprint grow with file size, which gets very heavy when a file happens to be really big.
- If you're reading each file multiple times, or reading, modifying, and writing a data set in multiple passes, try to find an approach that reads each file only once and makes a single pass over the data, without holding the entire file in memory (see the sketch below).
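To make both points above concrete, here is a minimal single-pass, line-at-a-time filter (the file names and the `transform` sub are hypothetical stand-ins for your real work): it touches the input only once and never holds more than one line in memory, so memory use stays flat no matter how big the file gets.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical file names; substitute your own.
open my $in,  '<', 'input.dat'  or die "Can't read input.dat: $!";
open my $out, '>', 'output.dat' or die "Can't write output.dat: $!";

# One pass, one line in memory at a time.
while (my $line = <$in>) {
    chomp $line;
    my $modified = transform($line);   # placeholder for the real work
    print {$out} $modified, "\n";
}

close $in;
close $out or die "Can't finish writing output.dat: $!";

# Placeholder transformation; fold whatever your separate passes were
# doing into a single step here if you can.
sub transform {
    my ($line) = @_;
    return uc $line;
}
```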