In reply to "buffering from a large file"

I have quite a large file, approx. 4 GB. I am running on a cluster, so memory is not an issue, so I went ahead and buffered the whole file into a variable. I hope this is much better than reading line by line?

Probably not. Discarding the three lines between each line you actually check will likely be far less costly than allocating 4 GB of memory to hold lines you are never going to use.
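
For what it's worth, here is a minimal sketch of the line-by-line approach: read with a while loop and skip everything but every 4th line (lines 2, 6, 10, ..., the ones your counter would call $i = 1, 5, 9, ...). The file name and /some_pattern/ are placeholders for whatever your real data and check are:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $file = 'big_file.txt';    # placeholder name
    open my $fh, '<', $file or die "Cannot open '$file': $!";

    while ( my $line = <$fh> ) {
        # $. is the current (1-based) line number; keep only lines 2, 6, 10, ...
        next unless ( $. - 2 ) % 4 == 0;

        if ( $line =~ /some_pattern/ ) {    # placeholder check
            print "match at line $.: $line";
        }
    }
    close $fh;

The unwanted lines never accumulate anywhere; each one is read into the same small buffer and thrown away on the next iteration.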

Now, since these operations are mostly independent -- that is, the pattern check on line 2 ($i = 1) could be done independently of the one on line 6 ($i = 5), and so on -- is it possible to create something like threads, or to do multiple checks at the same time?

It would certainly be possible to use a separate thread to process each selected line, but starting a new thread to process a single line -- unless the processing of that line is very CPU-intensive -- is unlikely to save any time; the cost of spawning the thread will usually outweigh the cost of the check itself.
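
If the per-line work really is CPU-heavy, a more economical arrangement than one thread per line is a small, fixed pool of worker threads fed from a shared queue. The sketch below assumes the core threads and Thread::Queue modules; the worker count, file name, pattern and check_line() body are placeholders standing in for your real workload:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use threads;
    use Thread::Queue;

    my $NUM_WORKERS = 4;                    # placeholder pool size
    my $queue       = Thread::Queue->new();

    # Placeholder for whatever CPU-intensive check you actually run per line.
    sub check_line {
        my ( $lineno, $line ) = @_;
        print "match at line $lineno\n" if $line =~ /some_pattern/;
    }

    # Each worker pulls jobs until it receives the undef termination signal.
    sub worker {
        while ( defined( my $job = $queue->dequeue() ) ) {
            check_line( @$job );
        }
    }

    my @workers = map { threads->create( \&worker ) } 1 .. $NUM_WORKERS;

    open my $fh, '<', 'big_file.txt' or die "Cannot open file: $!";
    while ( my $line = <$fh> ) {
        next unless ( $. - 2 ) % 4 == 0;    # only every 4th line (2, 6, 10, ...)
        $queue->enqueue( [ $., $line ] );
    }
    close $fh;

    $queue->enqueue( undef ) for 1 .. $NUM_WORKERS;   # one stop signal per worker
    $_->join() for @workers;

That way the thread-creation cost is paid only a handful of times rather than once per line, and the reader thread never gets far ahead of the workers' ability to keep up.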


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.