in reply to buffering from a large file
I have quite a large file, approx. 4 GB. I am running on a cluster, so memory is not an issue. So I went ahead and read the whole file into a variable. I hope this is much better than reading line by line?
Probably not. Discarding 3 lines will likely be far less costly than allocating memory to hold them when you are not going to use them.
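As a rough illustration of the read-and-discard approach, here is a minimal sketch. It assumes the pattern implied by the question: zero-based line indices 1, 5, 9, ... are wanted, i.e. every fourth line starting at $i = 1. The helper name process_selected_lines, its parameters, and the in-memory sample data are hypothetical, not taken from the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: calls $callback on every $step-th line of $fh,
# starting at zero-based line index $first, and returns how many lines
# were handed to the callback. Only one line is held in memory at a time.
sub process_selected_lines {
    my ($fh, $first, $step, $callback) = @_;
    my $i        = 0;    # zero-based line index, as in the question
    my $selected = 0;
    while (my $line = <$fh>) {
        # Lines that are not selected are simply read and dropped.
        if ($i >= $first && ($i - $first) % $step == 0) {
            chomp $line;
            $callback->($i, $line);
            $selected++;
        }
        $i++;
    }
    return $selected;
}

# Demonstration on an in-memory filehandle standing in for the large file.
my $data = join '', map { "line $_\n" } 0 .. 9;
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my @seen;
my $count = process_selected_lines($fh, 1, 4, sub {
    my ($i, $line) = @_;
    push @seen, "$i:$line";    # stand-in for the real pattern check
});
close $fh;
print "$count lines selected: @seen\n";
```

With a real file you would open it by name instead of using the in-memory string; the loop itself stays the same, and memory use stays at roughly one line regardless of file size.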
Now, these operations are mostly independent; that is, the pattern check on line 2 ($i = 1) could be done independently of the one on line 6 ($i = 5), and so on. Is it possible to create something like threads, or to do multiple checks at the same time?
It would certainly be possible to use a separate thread to process each selected line, but unless the processing of that line is very CPU-intensive, starting a new thread to process a single line is unlikely to save any time.