The good thing about the doubling strategy is that it keeps the number of reallocations to a minimum. The bad things are that the amount of data copied at each reallocation keeps growing with the array, and that the over-allocated (wasted) space can approach the size of the data itself.
I think I would use a different strategy. I would modify Tie::Array to (hold your breath:) use an array! Not for the individual elements, but as an array of dynamically grown, fixed-maximum-size 'buckets'. With array indices, working out which bucket holds any individual element of the logical array is simple math.
Preallocate the first bucket to, say, a 16k minimum and use the doubling strategy to reallocate it up to a maximum of, say, 64k or 128k. Then allocate a second bucket and follow the same strategy. You can also preallocate the bucket array itself to some reasonable starting size. A sketch of the idea follows.
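Here is a minimal, untested sketch of what I mean, assuming the 44-byte fixed-length records from the thread and the bucket sizes worked out further down. Tie::BucketArray and everything inside it are names I just made up for illustration, not an existing module.

```perl
package Tie::BucketArray;
use strict;
use warnings;
use Tie::Array;
our @ISA = 'Tie::Array';

# Geometry from the discussion: 44-byte records (assumed from the
# thread), buckets that start at 11 pages and double up to 44 pages.
use constant REC_SIZE   => 44;
use constant INIT_SIZE  => 11 * 4096;            # 45056 bytes
use constant MAX_SIZE   => 44 * 4096;            # 180224 bytes
use constant PER_BUCKET => MAX_SIZE / REC_SIZE;  # 4096 records per bucket

sub TIEARRAY  { bless { buckets => [], size => 0 }, shift }
sub FETCHSIZE { $_[0]{size} }
sub STORESIZE { $_[0]{size} = $_[1] }

# The 'simple math': which bucket, and the byte offset within it.
sub _locate {
    my( $self, $i ) = @_;
    return int( $i / PER_BUCKET ), ( $i % PER_BUCKET ) * REC_SIZE;
}

# Grow one bucket by doubling from INIT_SIZE up to MAX_SIZE.
sub _grow {
    my( $self, $b, $need ) = @_;
    my $bucket = \$self->{buckets}[ $b ];
    $$bucket = "\0" x INIT_SIZE unless defined $$bucket;
    my $len = length $$bucket;
    $len = $len * 2 > MAX_SIZE ? MAX_SIZE : $len * 2 while $len < $need;
    $$bucket .= "\0" x ( $len - length $$bucket ) if $len > length $$bucket;
}

sub STORE {
    my( $self, $i, $value ) = @_;
    my( $b, $off ) = $self->_locate( $i );
    $self->_grow( $b, $off + REC_SIZE );
    substr( $self->{buckets}[ $b ], $off, REC_SIZE )
        = pack 'a' . REC_SIZE, $value;
    $self->{size} = $i + 1 if $i >= $self->{size};
}

sub FETCH {
    my( $self, $i ) = @_;
    my( $b, $off ) = $self->_locate( $i );
    return undef unless defined $self->{buckets}[ $b ];
    return substr $self->{buckets}[ $b ], $off, REC_SIZE;
}

1;
```

The caller just does `tie my @records, 'Tie::BucketArray';` and then uses @records normally; the tie interface hides all the bucket plumbing.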
This strategy minimises the total memory allocated once the array is fully populated, by capping the wasted space at one bucket size. It also limits the amount of memory that must be copied at each reallocation to half that same value, and it minimises the 'perl array' overhead.
The values of 16k and 64k are just examples. Choosing the right initial and maximum sizes for your buckets will depend very much on your rate of growth etc. One interesting number is 180224, which is 44 * 4096. Using this as your maximum bucket size means there is no wastage within a full bucket and the math stays simple: each full bucket holds exactly 4096 of your 44-byte records and occupies exactly 44 OS pages. You could use the doubling strategy starting with an initial allocation of 11 * 4096 = 45056 bytes, and double until you reach 44 * 4096 = 180224. This keeps the maximum wastage at completion reasonable, and an integral number of your 44-byte records fits at every stage (1024, then 2048, then 4096).
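For anyone who wants to check the arithmetic, the geometry works out like this (these are just the worked-example figures, not magic numbers):

```perl
# Bucket geometry for 44-byte records.
print 44 * 4096,    "\n";   # 180224: one full bucket
print 180224 / 44,  "\n";   # 4096 records fit exactly, no waste
print 180224 / 4096, "\n";  # exactly 44 OS pages
print 45056 / 44,   "\n";   # 1024 records in the initial 11-page bucket
print 90112 / 44,   "\n";   # 2048 records after one doubling
```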
Using 44 * 4096 as the maximum bucket size to store 100 MB of data would need a bucket array of 582 elements (buckets). Preallocating a 582-element array of null scalars uses around 132k, giving an overhead of just 0.13%, which doesn't seem too bad.
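The overhead figure is easy to verify, taking the ~132k cost of the 582-element array above as given:

```perl
use POSIX 'ceil';
my $data     = 100 * 1024 * 1024;          # 100 MB of record data
my $buckets  = ceil( $data / 180224 );     # buckets needed
my $overhead = 132 * 1024 / $data * 100;   # bucket-array cost as a percentage
printf "%d buckets, %.2f%% overhead\n", $buckets, $overhead;
# prints: 582 buckets, 0.13% overhead
```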
The reason for using 4096 is that Win32 tends to round memory commits up to the nearest page size, which is 4096 bytes on Intel. However, it is quite likely that perl adds some overhead bytes to each allocation, beyond those of the user data structures, in order to manage its internal heaps. It is also quite likely that, if your perl calls the compiler runtime to allocate its heaps, the runtime adds a few bytes of its own to each allocation.
Overall, working out the actual overhead of any given allocation is pretty much impossible, given the documented overhead of perl's scalars and array chains plus the undocumented overhead of perl's and the runtime's heap management, so using the 4k OS page size as the basic unit seems as good a way to go as any. On other OSes and platforms this would probably be different.