Re^2: Completeness of Devel::Size answers
by gone2015 (Deacon) on Sep 23, 2008 at 00:35 UTC
OK... If I run the code with Devel::Size::size it says the 1 million entry hash requires 94.8 M. If I run with the calls to Devel::Size::size commented out, the VM still grows from ~100M to ~270M -- nearly twice what's expected :-(
For the full 5 million, VM grows to 1.0G, so the hash appears to need 900M -- which is a bit more than 188 bytes per entry... whereas Devel::Size::size suggested ~100?
Mind you, I am assuming that all the increase in VM size is down to the hash -- which I think it is, because if I comment out the assignment to the hash, the VM doesn't change size.
Thanks.
| [reply] [d/l] [select] |
Mind you, I am assuming that all the increase in VM size is down to the hash
If you could inspect the memory, you would probably find that a substantial portion of the total memory allocated to the process is actually free for re-use once the hash has been built.
This is because, as the hash grows and fills towards the capacity of its current allocation, it reaches a point where Perl decides it is time to double the number of buckets. In effect, a new hash is allocated that has twice the capacity of the old one, and the contents of the old one are copied into the new before any new keys are added. Once the copy is complete, the memory for the old copy is freed back to the process pool (but not to the system) for reuse. So whilst it may take (say) 600MB to hold the completely built hash, there is a brief point in time where two copies of the hash-so-far are required.
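A minimal sketch of how to watch this happening, assuming a Perl recent enough to have Hash::Util's num_buckets()/used_buckets() (they didn't exist when this was written; on older perls a hash in scalar context gave a used/total figure instead):

#!/usr/bin/perl
use strict;
use warnings;
use Hash::Util qw(num_buckets used_buckets);    # core module; these two need a newish perl

my %h;
my $last = 0;
for my $i (1 .. 100_000) {
    $h{"key_$i"} = $i;
    my $buckets = num_buckets(%h);
    next if $buckets == $last;                  # only report when perl doubles the bucket array
    printf "%7d keys: %7d used / %7d buckets\n",
        scalar(keys %h), used_buckets(%h), $buckets;
    $last = $buckets;
}

Each report should come at roughly twice the key count of the previous one, mirroring the doubling described above.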
If you have the ability to monitor the process memory usage as the hash is built, you'll see the graph of the memory allocated versus time looks something like this:
                    _________________
                    |
                    |
                    |
                    |
                    |
                    |
                    |
            ________|
            |
            |
            |
        ____|
        |
      __|
      |
Where both the vertical jumps (memory allocated) and the horizontal runs (time between reallocs) double. It is an effective strategy for performance, but it can mean that a substantial portion of the memory allocated to the last-but-one incarnation of the hash doesn't get re-used (by the hash). However, that memory is available for use by the rest of your program.
One thing that may allow you to reduce the overall memory consumption of your program is to ensure that the big hash gets built first. For example, you suggest that you've already loaded/generated 100MB of data in memory before you build this hash. If you delayed the creation/loading of that data until after the hash is constructed, you may find that there is enough memory (allocated during the construction of the hash, but freed back to the pool before it is complete) to accommodate most or all of that 100MB without the need to request more from the OS.
If you're banging your head against the limits of your machine, that may just be enough to stop you going into swapping. Worth a try at least if your program logic can accommodate that change.
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
| [reply] [d/l] |
:  Entries :   Used/Buckets :  Expect :    Heap :     Aux :
:----------:--------/-------:---------:---------:---------:
:        0 :       0/     0 :    0.0M :    0.9M :         :
:       10 :       9/    10 :    0.0M :      "  :         :
:       20 :       E/    20 :    0.0M :      "  :         :
:       40 :      1B/    40 :    0.0M :      "  :         :
:       70 :      38/    80 :    0.0M :      "  :         :
:      130 :      67/   100 :    0.0M :      "  :         :
:      260 :      CD/   200 :    0.0M :      "  :         :
:      550 :     1A8/   400 :    0.1M :    1.0M :         :
:     1100 :     368/   800 :    0.1M :    1.2M :         :
:     2100 :     66D/  1000 :    0.2M :    1.3M :         :
:     4100 :     C9B/  2000 :    0.4M :    1.7M :         :
:     8200 :    1913/  4000 :    0.8M :    2.2M :    0.3M :
:    16400 :    327F/  8000 :    1.7M :    3.4M :    0.8M :
:    32800 :    64D5/ 10000 :    3.3M :    5.6M :    1.8M :
:    65600 :    CA94/ 20000 :    6.7M :   10.1M :    3.8M :
:   131100 :   1940F/ 40000 :   13.4M :   19.1M :    7.8M :
:   262200 :   325A8/ 80000 :   26.8M :   37.2M :   15.8M :
:   524300 :   64A98/100000 :   53.5M :   73.3M :   31.8M :
:  1048600 :   C9691/200000 :  107.0M :  145.6M :   63.8M :
:  2097200 :  192DE0/400000 :   -0.0M :  290.0M :  127.8M :
:  4194400 :  326095/800000 :   -0.0M :  578.9M :  255.8M :
:  5000000 :  3977F6/800000 :   -0.0M :  683.7M :  255.8M :
This shows the effect you described: the hash being rebuilt with more buckets as it grows, and the heap growing with it. (The Used/Buckets column is given in hex.)
This table also shows the expected size of the hash, according to Devel::Size::total_size() on a separate run -- where it says -0.0M I don't have a value, because life is too short.
The figures for actual memory use are given by the System Monitor. The Heap is present when Perl starts. The Aux appears as the hash grows -- FWIW, I observe that it's at the far end of VM space, below the Stack.
NB: the actual memory use is taken when my test is run with the Devel::Size::total_size() calls commented OUT -- so the figures are not affected by its overheads.
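For concreteness, a minimal sketch of the kind of test being described (the key format is a guess -- the only stated detail is that the keys were 33 characters -- and the actual-memory runs had the total_size() call commented out):

#!/usr/bin/perl
use strict;
use warnings;
use Devel::Size qw(total_size);

my @checkpoints = (10, 20, 40, 70, 130, 260, 550, 1_100, 2_100, 4_100,
                   8_200, 16_400, 32_800, 65_600, 131_100, 262_200,
                   524_300, 1_048_600, 2_097_200, 4_194_400, 5_000_000);

my %h;
my $n = 0;
for my $limit (@checkpoints) {
    while ($n < $limit) {
        $n++;
        $h{ sprintf '%033d', $n } = $n;     # 33-character key (exact format is a guess)
    }
    printf "%9d entries: expect %7.1fM\n",
        scalar(keys %h), total_size(\%h) / (1024 * 1024);
    # ... and read the process's VM size from the System Monitor at this point
}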
So, assuming that the hash is created on the heap, on the face of it, either Devel::Size::total_size() is under-reporting, or the construction of the hash is leaving free space in the heap. Anyway, the discrepancy is ~38 bytes per entry or ~19 bytes per bucket. (I assume that Devel::Size::total_size() does take into account the bucket overhead?)
I tried reducing the key size from 33 to 17. I got the same discrepancy.
The other puzzle is what the Aux is. It appears to be something to do with the rebuilding of the hash. The size is independent of the key size: it's 32 bytes per bucket at the new bucket count, or 64 bytes per bucket at the old count! In any case, it's a significant chunk of memory (virtual or otherwise) :-(
| [reply] [d/l] [select] |
Re^2: Completeness of Devel::Size answers
by jand (Friar) on Sep 23, 2008 at 00:09 UTC
You can detect circles without all the bookkeeping overhead, but you wouldn't be able to calculate the total size of a structure containing circular references (you may end up counting some elements multiple times, and you can't be sure you traversed the whole structure before you detected a circle). | [reply] |
You can detect circles without all the bookkeeping overhead,
Interesting. How?
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
| [reply] |
To detect cycles, you need only keep a hash as large as your search is deep. That even allows you to detect cycles "immediately", but it doesn't tell you when you have the same structure referenced from two different places in your directed, acyclic graph:

my @a = ( \%huge, \%huge );
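For illustration (this is not tye's code, just a sketch of the idea): a walker whose %seen hash only ever holds the nodes on the current path, so it grows with the depth of the search rather than the size of the structure -- it spots genuine cycles, but not structures reached twice by different routes:

use strict;
use warnings;
use Scalar::Util qw(refaddr reftype);

sub has_cycle {
    my ($node, $on_path) = @_;
    $on_path ||= {};
    return 0 unless ref $node;

    my $addr = refaddr $node;
    return 1 if $on_path->{$addr};              # revisiting an ancestor: a cycle
    $on_path->{$addr} = 1;                      # on the current path

    my $type = reftype $node;
    my @kids = $type eq 'ARRAY' ? @$node
             : $type eq 'HASH'  ? values %$node
             : $type eq 'REF' || $type eq 'SCALAR' ? ($$node)
             : ();
    my $cyclic = grep { has_cycle($_, $on_path) } @kids;

    delete $on_path->{$addr};                   # leaving this node: the path shrinks again
    return $cyclic ? 1 : 0;
}

my %huge = (x => 1);
my @a = ( \%huge, \%huge );                     # shared but acyclic: reported as no cycle
my @loop; push @loop, \@loop;                   # a genuine cycle
print has_cycle(\@a), "\n";                     # 0
print has_cycle(\@loop), "\n";                  # 1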
| [reply] [d/l] |
FWIW, I was stunned when I came across the following:
To detect cycles you can use two pointers -- say, p and q. You step p through the data in the usual way. You step q through the data twice for each step of p, until the end is met. If p equals q before the end, you have a loop.
The tricky bit in a recursive descent tree walk is to walk two pointers at once! You need to build an iterator for q which contains at least a stack for recursion, and, for Perl, some way of iterating through a hash which isn't going to trip up p. (Of course, you have to guarantee that p and q will visit the nodes in the same order, especially where there is a loop!)
Apart from that minor difficulty, this trick only tells you there is a cycle. It doesn't tell you where it is.
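For illustration, the classic linked-list form of the trick (rather than the recursive-descent version being discussed), as a minimal Perl sketch -- p is the slow pointer, q the fast one:

use strict;
use warnings;

sub has_loop {
    my ($head) = @_;
    my ($p, $q) = ($head, $head);
    while ($q && $q->{next}) {
        $p = $p->{next};                # p: one step
        $q = $q->{next}{next};          # q: two steps
        return 1 if $q && $p == $q;     # q caught p: there is a loop
    }
    return 0;                           # q fell off the end: no loop
}

# build a five-node list, then bend the tail back to the second node
my @node = map { +{ value => $_, next => undef } } 1 .. 5;
$node[$_]{next} = $node[$_ + 1] for 0 .. 3;
print has_loop($node[0]), "\n";         # 0
$node[4]{next} = $node[1];
print has_loop($node[0]), "\n";         # 1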
"It's a good sort of brake. But it hasn't worked yet."
| [reply] |
Re^2: Completeness of Devel::Size answers
by gone2015 (Deacon) on Sep 23, 2008 at 10:13 UTC
The problem is that Devel::Size itself uses a substantial amount of memory. Internally, it builds a hash of all the SVs (scalars) it encounters whilst traversing the structure being sized. It has to do this in order to detect circular references that would otherwise cause it never to return.
This strikes me as pessimistic. With two passes, the first could collect all references and avoid following repeats -- avoiding loops and avoiding counting things twice. The second pass could generate the size count.
Actually, AFAICS, the only reason you need two passes is that with ref:SCALAR you cannot otherwise tell whether the SCALAR is internal or external, and, if internal, whether it's already been counted. My guess is that ref:SCALAR is relatively rare...
... perhaps the trick is to proceed optimistically, constructing the size during pass 1. At the end, if there are any ref:SCALAR entries, run a half-pass to rescan the structure, looking for any referred-to internal SCALARs, up to where they are first referred to.
... I can see that for some structures one and a half passes would be more expensive than the current approach. However, at least the penalty is paid where things are difficult. The current stuff falls over if the data is large, however simple.
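To make the bookkeeping concrete, a Perl-level sketch (this is not how Devel::Size is implemented, and it counts nodes rather than bytes): a single pass that remembers only things that are actually the target of a reference -- a much smaller set than every SV -- so that repeats, whether shared substructures or cycles, are followed only once:

use strict;
use warnings;
use Scalar::Util qw(refaddr reftype);

sub walk {
    my ($node, $seen) = @_;
    return 0 unless ref $node;
    return 0 if $seen->{ refaddr $node }++;    # already visited: don't follow it again

    my $type = reftype $node;
    my @kids = $type eq 'ARRAY' ? @$node
             : $type eq 'HASH'  ? values %$node
             : $type eq 'REF' || $type eq 'SCALAR' ? ($$node)
             : ();
    my $count = 1;                             # count this node
    $count += walk($_, $seen) for @kids;
    return $count;
}

my %leaf = (a => 1);
my @top  = (\%leaf, \%leaf, [1, 2, 3]);
push @top, \@top;                              # a cycle, for good measure
print walk(\@top, {}), "\n";                   # 3: @top, %leaf (once), the inner array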
Rats. Something in the back of my mind is telling me that string value SVs may implicitly refer to the string... But then there would be a ref-count > 1... which could be picked up in Pass 1...
| [reply] |