in reply to mmaping a large file
map_file is not supposed to load the whole file into memory, is it?
Yes it does, if you access the entire file. It just does so lazily, on demand rather than all at once when you first 'read' it.
That is to say, when you first map a file, none of its contents are actually loaded from disk. A chunk of your process's virtual address space -- the size of the file -- is reserved and the mapping call returns very quickly. Then, when you attempt to access bits of the file, the 4096-byte page(s) containing the bit you access will be loaded from disk on demand (via page faults).
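To make that concrete, here is a minimal sketch using File::Map's map_file (the filename and the offset touched are made up for illustration):

```perl
use strict;
use warnings;
use File::Map 'map_file';

# Mapping is nearly instant even for a multi-gigabyte file:
# address space is reserved, but nothing is read from disk yet.
map_file my $map, 'big.dat';          # hypothetical file
printf "mapped %d bytes\n", length $map;

# Only when a byte is actually touched does the kernel fault in
# the 4096-byte page that contains it.
my $byte = substr $map, int( length($map) / 2 ), 1;
```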
If you have a large dataset in a file, and a) you only need access to small bits of it in any given run, and b) you can find those bits without reading through the whole file from the beginning, then mapping can be an effective way of minimising the number of pages read from disk.
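For instance, if the file held fixed-width records (an assumption for the sake of the sketch; the record length and filename are made up), you can compute the offset of any record and touch only the page or two that hold it:

```perl
use strict;
use warnings;
use File::Map 'map_file';

my $RECLEN = 64;                      # assumed fixed record width

map_file my $map, 'records.dat';      # hypothetical file

# Fetch record N directly: at most two pages are faulted in,
# no matter how far into the file it lies.
sub get_record {
    my ($n) = @_;
    return substr $map, $n * $RECLEN, $RECLEN;
}

print get_record(1_000_000), "\n";
```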
But if all you are going to do with the file is read it serially from beginning to end, you're better off using normal file IO, which doesn't cause page faults and can read the entire file (serially) through a small amount of memory, e.g. line by line through one or two page-sized buffers.
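That is just the ordinary open/readline loop; the whole file streams through perl's small IO buffer rather than being faulted in page by page (filename again made up):

```perl
use strict;
use warnings;

open my $fh, '<', 'big.dat' or die "open: $!";   # hypothetical file
while ( my $line = <$fh> ) {
    # process one line at a time; memory use stays constant
}
close $fh;
```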
Memory mapping also requires that you have sufficient virtual address space in your process to hold the amount of the file you need concurrent access to. For 32-bit processes, that means files larger than 2GB require the programmer to re-map them a piece at a time in order to access the whole file.
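One way to do that (a sketch only; the window size and filename are assumptions, and File::Map's map_handle is used here because it takes an offset and length) is to slide a fixed-size window across the file:

```perl
use strict;
use warnings;
use File::Map 'map_handle';

my $WINDOW = 256 * 1024 * 1024;       # 256MB, a multiple of the page size

open my $fh, '<:raw', 'huge.dat' or die "open: $!";   # hypothetical file
my $size = -s $fh;

for ( my $off = 0; $off < $size; $off += $WINDOW ) {
    my $len = $size - $off < $WINDOW ? $size - $off : $WINDOW;

    # Map only the current window; the previous one is unmapped
    # when $window goes out of scope at the end of the iteration.
    map_handle my $window, $fh, '<', $off, $len;

    # ... work on $window here ...
}
close $fh;
```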