One drawback of the line-by-line approach in this user's case, though, is that it will be much slower: he is reading these files over the network, and the Windows backend will be far more efficient if he pulls the whole file at once.
Do you have evidence to support this? My experience says the opposite. For one, reading the file in slurp mode doesn't save substantial network traffic over reading it line-at-a-time, since disk pages are read and buffered to support per-line access. For another, assuming the pattern you're trying to match occurs once and is distributed randomly through the target file, on average you'll only need to read half the file to match it.
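To make the early-exit argument concrete, here is a minimal sketch. The helper names `find_by_line` and `find_by_slurp` are made up for illustration, and the line count is only a rough proxy for network traffic, since the underlying reads happen in buffer-sized chunks rather than one line at a time.

```perl
use strict;
use warnings;

# Hypothetical helper: scan line by line and stop at the first match.
# Returns (line number of the match or undef, number of lines read).
sub find_by_line {
    my ($path, $pattern) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my $lines_read = 0;
    while (my $line = <$fh>) {
        $lines_read++;
        if ($line =~ $pattern) {
            close $fh;
            return ($lines_read, $lines_read);   # stop early, rest is never read
        }
    }
    close $fh;
    return (undef, $lines_read);
}

# Hypothetical helper: slurp the whole file, then match once.
# Returns 1 if the pattern matches anywhere, 0 otherwise.
sub find_by_slurp {
    my ($path, $pattern) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    local $/;                       # undef record separator: read everything
    my $content = <$fh>;
    close $fh;
    $content = '' unless defined $content;
    return $content =~ $pattern ? 1 : 0;
}
```

With a single match placed at random, `find_by_line` would be expected to stop around the middle of the file on average, while `find_by_slurp` always transfers all of it.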
I have run into a few C projects where mmapped files over network mounts on Windows were dropped in favor of a full read of the file, in order to get the low-level Windows networking code to burst the file. In this case, though, your point may hold: if the match occurs at a random position in the file, there is no need to transfer the whole file across the network. In my cases I need access to the whole file every time. There is a simple way to see the difference between memory-mapped file access and a full open/read that triggers burst mode: copying with Explorer almost always triggers burst mode. Try installing MS Office from a network drive (the installer mmaps the cabs) and time it, then time copying the files across and installing locally. A Perl-only test would be a slurp-and-dump to local versus a line-by-line dump to local on a large text file. dws++ for bringing up a point I completely missed, though.
Edited: Also, as far as I know, the bursting mode does not work on Samba servers.
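The Perl-only test could be sketched roughly as follows. The names `copy_slurp`, `copy_by_line` and `time_it` are hypothetical helpers, not anything from the original posts, and the timings will be skewed by caching unless each run starts cold.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical helper: read the whole source in one go, then write it out.
# Returns the number of bytes copied.
sub copy_slurp {
    my ($src, $dst) = @_;
    open my $in,  '<', $src or die "Cannot read $src: $!";
    open my $out, '>', $dst or die "Cannot write $dst: $!";
    binmode $in;
    binmode $out;
    my $data = do { local $/; <$in> };   # slurp mode
    $data = '' unless defined $data;
    print {$out} $data;
    close $in;
    close $out or die "Cannot close $dst: $!";
    return length $data;
}

# Hypothetical helper: copy one line at a time.
# Returns the number of bytes copied.
sub copy_by_line {
    my ($src, $dst) = @_;
    open my $in,  '<', $src or die "Cannot read $src: $!";
    open my $out, '>', $dst or die "Cannot write $dst: $!";
    binmode $in;
    binmode $out;
    my $bytes = 0;
    while (my $line = <$in>) {
        print {$out} $line;
        $bytes += length $line;
    }
    close $in;
    close $out or die "Cannot close $dst: $!";
    return $bytes;
}

# Run a code reference and return the elapsed wall-clock seconds.
sub time_it {
    my ($code) = @_;
    my $start = time;
    $code->();
    return time - $start;
}
```

Calling `time_it(sub { copy_slurp($src, $dst) })` and `time_it(sub { copy_by_line($src, $dst) })` with `$src` on the network share and `$dst` on a local disk would give the two numbers to compare.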
-Waswas
I may be out of date, but doesn't Perl use its PerlIO layer, which just indirectly uses fseek, fwrite and ftell ad nauseam to mem-map files for line-by-line reads? Last time I looked, I don't think I saw that it buffered the entire file, which is what you need to do (i.e. read the entire file in one swoop) to get Windows bursting mode to kick in.
stdio
Layer which calls fread, fwrite and fseek/ftell etc. Note that as this is "real" stdio it will ignore any layers beneath it and go straight to the operating system via the C library as usual.
perlio
This is a re-implementation of "stdio-like" buffering written as a PerlIO "layer". As such it will call whatever layer is below it for its operations.
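To check which of these layers a given handle actually ends up with, something like the following sketch may help. `layers_for` is a hypothetical helper name, `PerlIO::get_layers` is the core function it wraps, and the exact list returned depends on the platform and on how perl was built.

```perl
use strict;
use warnings;

# Hypothetical helper: open $path with the given mode string (which may
# name explicit layers, e.g. '<:perlio') and return the layer names that
# perl reports for the resulting handle.
sub layers_for {
    my ($mode, $path) = @_;
    open my $fh, $mode, $path or die "Cannot open $path with $mode: $!";
    my @layers = PerlIO::get_layers($fh);
    close $fh;
    return @layers;
}
```

For example, `print join(' ', layers_for('<', $file)), "\n";` lists the default stack, and passing `'<:stdio'` instead should show the stdio layer on builds that include it.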
-Waswas