in reply to Parsing Large Text Files For Performance

One thing I'm curious about: do you guys think there would be a more efficient way to read in the file, instead of doing it line by line? Perhaps a way to read in one packet at a time? The files are too large for slurping, but is line-by-line the most efficient way?

Re^2: Parsing Large Text Files For Performance
by BrowserUk (Patriarch) on Feb 01, 2011 at 00:35 UTC

    In general, reading a file in fixed-size blocks is somewhat faster than reading it line by line, especially if the block size is chosen to coincide with the 'natural' read size of the filesystem or device holding the file.

    This is easily explained: when reading line by line, the runtime still reads a block first, but then has to scan that block for the end-of-line character and copy the appropriate number of bytes to another buffer before returning them to the calling program.

    But for your application, where you want the individual lines that contain your matching terms, if you read the file block-wise you'd then have to break the block up yourself, searching for the newlines either side of each match, so the net result would be much the same amount of work. And searching for line ends in Perl will usually be slower than letting the system do it in C.
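
    To make the extra bookkeeping concrete, here is a minimal sketch of block-wise reading. The file name 'capture.log', the search term 'ERROR', and the 64 KiB block size are all assumptions for illustration, not anything from the original post.

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $block_size = 64 * 1024;      # arbitrary; ideally matches the filesystem's natural read size
    my $file       = 'capture.log';  # hypothetical input file

    open my $fh, '<', $file or die "Cannot open '$file': $!";

    my $leftover = '';    # partial line carried over from the previous block

    while ( read( $fh, my $block, $block_size ) ) {
        $block = $leftover . $block;

        # Keep any trailing partial line for the next iteration.
        my $last_nl = rindex( $block, "\n" );
        if ( $last_nl == -1 ) {
            $leftover = $block;
            next;
        }
        $leftover = substr( $block, $last_nl + 1 );
        $block    = substr( $block, 0, $last_nl + 1 );

        # We still end up splitting the block into lines in Perl,
        # which is the extra work mentioned above.
        for my $line ( split /\n/, $block ) {
            print "$line\n" if index( $line, 'ERROR' ) >= 0;
        }
    }

    # Handle a final line with no trailing newline.
    print "$leftover\n" if length $leftover and index( $leftover, 'ERROR' ) >= 0;

    close $fh;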


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re^2: Parsing Large Text Files For Performance
by GrandFather (Saint) on Feb 01, 2011 at 00:10 UTC

    Write a benchmark and test it! There are too many variables that may come into play for us to give a really useful answer without actually trying it on hardware equivalent to the system you are using.
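
    As a rough starting point, something like the following cmpthese sketch would compare the two approaches. The file name 'capture.log' and the term 'ERROR' are placeholders; substitute your real data and matching logic.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Benchmark qw(cmpthese);

    my $file = 'capture.log';    # hypothetical test file

    cmpthese( 10, {
        line_by_line => sub {
            open my $fh, '<', $file or die "Cannot open '$file': $!";
            my $count = 0;
            while ( my $line = <$fh> ) {
                $count++ if index( $line, 'ERROR' ) >= 0;
            }
            close $fh;
        },
        block_wise => sub {
            open my $fh, '<', $file or die "Cannot open '$file': $!";
            my $count = 0;
            while ( read( $fh, my $block, 64 * 1024 ) ) {
                $count++ while $block =~ /ERROR/g;   # ignores matches split across block boundaries
            }
            close $fh;
        },
    } );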

    True laziness is hard work
      I will do that, thanks!