Running your routine on 7 files of 200,000 lines apiece (with limit = 1000) takes just 10.5 seconds; running it on 100 files of 200,000 lines takes 145 seconds on my machine.
That shows (as expected) that the runtime is pretty much linear with respect to the number of files: about 1.5 seconds per file in both cases.
Your figures (40s for 7 files and 1500s for 100) work out to roughly 5.7 and 15 seconds per file respectively, which suggests that the majority of the time is being spent outside of this routine, doing something non-linear.
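For anyone who wants to check the arithmetic, here is a minimal sketch (in Python, purely for illustration; the variable and function names are mine, and the only data are the four timings quoted above):

```python
# Timings quoted in this thread, as (number of files, seconds) pairs.
routine_alone = [(7, 10.5), (100, 145.0)]   # the routine timed on its own
whole_program = [(7, 40.0), (100, 1500.0)]  # the figures reported by the OP

def per_file(timings):
    """Return the seconds-per-file cost for each (files, seconds) pair."""
    return [secs / files for files, secs in timings]

# A linear routine costs about the same per file regardless of file count.
print("routine alone:", per_file(routine_alone))
# A per-file cost that grows with the file count points at non-linear work.
print("whole program:", per_file(whole_program))
```

The per-file cost of the routine alone stays essentially flat, while the per-file cost of the whole program more than doubles between 7 and 100 files.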
In reply to Re: Optimizing I/O intensive subroutine
by BrowserUk
in thread Optimizing I/O intensive subroutine
by hperange