in reply to File reading efficiency and other surly remarks
I think the most important difference between slurping an entire file and reading line-by-line is that it is pretty easy to have a file so large that slurping it will not just be slower than reading line-by-line; it will fail outright.
Of somewhat lesser concern is that the line-by-line method scales linearly, while the slurp method starts to slow down rather dramatically once files grow large enough to strain available memory.
So if you have an operation that can be done fairly efficiently in a line-by-line manner, I think you should almost always do it that way.
If you are doing a small file, then the speed-up of slurp mode probably just isn't enough to make much difference. If you are doing large files, then you can't risk the possible huge slow-down or outright failure.
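To make the contrast concrete, here is a minimal sketch of the two idioms in Perl (the filename is just a placeholder). Note how the line-by-line loop keeps memory use bounded no matter how big the file is, while the slurp pulls the whole thing into one scalar:

    use strict;
    use warnings;

    my $file = 'data.txt';    # hypothetical filename

    # Line-by-line: memory use stays small regardless of file size
    open my $fh, '<', $file or die "Can't open $file: $!";
    while ( my $line = <$fh> ) {
        # process one line at a time
    }
    close $fh;

    # Slurp: reads the whole file into one scalar. Fine for small
    # files, but can fail outright if the file won't fit in memory.
    open $fh, '<', $file or die "Can't open $file: $!";
    my $contents = do { local $/; <$fh> };    # undef $/ disables line splitting
    close $fh;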
If you are doing large files and need as much speed as possible, then you often read and write files chunk-by-chunk (which has the extra advantage of working even when you have a file containing a single line that is too large to fit in your available virtual memory space). This requires the use of read() or sysread() and possibly syswrite(). (A "chunk" is usually a fixed-length and fairly large buffer, like 64K).
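Here is a rough sketch of that chunk-by-chunk approach, copying one file to another with sysread() and syswrite() and a 64K buffer (the filenames are placeholders). The inner loop matters: syswrite() is allowed to write fewer bytes than you asked for, so you have to keep writing until the whole chunk is out:

    use strict;
    use warnings;

    my $chunk_size = 64 * 1024;    # 64K fixed-length buffer

    open my $in,  '<', 'input.dat'  or die "Can't read input.dat: $!";
    open my $out, '>', 'output.dat' or die "Can't write output.dat: $!";
    binmode $in;
    binmode $out;

    my $buf;
    while ( 1 ) {
        my $read = sysread( $in, $buf, $chunk_size );
        die "sysread failed: $!" unless defined $read;
        last unless $read;         # 0 bytes read means end of file

        my $offset = 0;
        while ( $read > 0 ) {      # syswrite may do a partial write
            my $written = syswrite( $out, $buf, $read, $offset );
            die "syswrite failed: $!" unless defined $written;
            $read   -= $written;
            $offset += $written;
        }
    }

    close $in;
    close $out  or die "Can't close output.dat: $!";

Because only one buffer's worth of data is ever in memory, this works even on a file that is one enormous line, which would defeat both slurping and line-by-line reading.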
But this gets complicated quickly. No simple answer is going to cover all cases.
- tye (but my friends call me "Tye")