Whenever I've thought of doing any benchmarking involving file I/O, I've always been stumped as to how I could be sure that filesystem caching didn't mess up the results.
How can I be sure that doing a test X number of times will give results comparable to executing the code once in a real program?
With such extremely different methods of reading, I can only imagine that it's even harder to be sure that caching isn't helping or hindering individual tests.
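
For what it's worth, here is a rough sketch of one way to at least see how much the cache is affecting things: time each read twice, once after asking the kernel to evict the file's pages and once immediately afterwards, when the pages should be cached again. The helper names (`timed_read`, `drop_file_cache`, `benchmark`) are made up for illustration. It assumes a POSIX system where Python exposes `os.posix_fadvise`; `POSIX_FADV_DONTNEED` is only a hint, so the "cold" number is not guaranteed to be truly cold, and drive or controller caches are not touched at all.

```python
import os
import tempfile
import time


def timed_read(path, chunk_size=1 << 20):
    """Read the whole file once and return (elapsed_seconds, bytes_read)."""
    start = time.perf_counter()
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return time.perf_counter() - start, total


def drop_file_cache(path):
    """Hint the kernel to evict this file's cached pages.

    Returns False when the hint is unavailable or rejected. This is advice
    only: the kernel may ignore it, and dirty pages are not dropped.
    """
    if not hasattr(os, "posix_fadvise"):
        return False
    fd = os.open(path, os.O_RDONLY)
    try:
        # offset=0, len=0 means "the whole file"
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    except OSError:
        return False
    finally:
        os.close(fd)
    return True


def benchmark(path, runs=5):
    """Time a (hopefully) cold read and then a warm read, several times over."""
    results = []
    for _ in range(runs):
        dropped = drop_file_cache(path)
        cold, size = timed_read(path)   # pages were (maybe) evicted first
        warm, _ = timed_read(path)      # pages should now be in the cache
        results.append(
            {"cold_s": cold, "warm_s": warm, "bytes": size, "cache_dropped": dropped}
        )
    return results


def demo():
    # Write an 8 MiB scratch file and fsync it so its pages are clean
    # and therefore eligible for eviction.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(8 * 1024 * 1024))
        tmp.flush()
        os.fsync(tmp.fileno())
        path = tmp.name
    try:
        for row in benchmark(path, runs=3):
            print(row)
    finally:
        os.remove(path)


if __name__ == "__main__":
    demo()
```

If the cold and warm timings come out close together, either the hint was ignored or the file is small enough that caching hardly matters; if they differ a lot, repeated runs of a test are mostly measuring the cache rather than the disk. Reporting both numbers for each reading method seems safer than trusting a single average.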