Techniques On Saving Memory

Limbic~Region has asked for the wisdom of the Perl Monks concerning the following question:

Re: Techniques On Saving Memory by BrowserUk (Patriarch) on Mar 09, 2005 at 18:55 UTC
Take a look at Devel::Pointer. It has routines for unsmashing smashed references. It works, and I have used it to do something very similar to what you are attempting. There are two problems. The first is that you also need to store the type of ref (scalar/array/hash/code) along with the smashed address. That spoils the otherwise convenient 32 bits per element storage requirement, making the math messier than it ought to be. The second is that you also need to store unsmashed copies of the references so that perl will not GC the elements to which you are storing smashed refs. That duplication of storage negates much of the benefit and, combined with the fact that you are now responsible for managing the memory instead of allowing Perl to take care of it, makes for a heavier and riskier module than might otherwise be the case. I have the germ of an idea for dealing with that too, but the tools are not available to me to implement it (yet).

Examine what is said, not who speaks. Silence betokens consent. Love the truth but pardon error.
Re^2: Techniques On Saving Memory by halley (Prior) on Mar 09, 2005 at 19:05 UTC
"you also need to store unsmashed copies of the references so that perl will not GC the elements to which you are storing smashed refs"

Ooo, ouch. Good point. I think you will find, case after case, that by the time you're done "improving" a perl feature, you're going to have a solution that is as complex (in memory and time and code and shortcomings) as the perl implementation itself.

-- `[ e d @ h a l l e y . c c ]`
Re^3: Techniques On Saving Memory by BrowserUk (Patriarch) on Mar 09, 2005 at 19:38 UTC
I've been trying to get this idea right for almost two years now. On two occasions I've even reached the point where I thought I was ready to let code loose on the world, but then I discovered something which I wasn't able to handle without duplicating the references in a hash, which completely negates any benefit at all. As I said, I have the germ of a solution, but as I am entirely uncomfortable using XS, or even XS modules in ways that move beyond their authors' intent, it will remain just an idea until I can think of a way of proving its safety.
Re^2: Techniques On Saving Memory by Limbic~Region (Chancellor) on Mar 09, 2005 at 19:21 UTC
BrowserUk,

"The second is you also need to store unsmashed copies of the references so that perl will not GC the elements to which you are storing smashed refs. That duplication of storage negates much of the benefit,..."

tye was quick to point out that an easy solution to the problem was to use a parallel hash as a lookup table. I admitted it was a good, straightforward answer, but it defeated my purpose. On the other hand, if an anonymous sub can be made to take up less space than the anonymous array:

    sub compact {
        my $ref = shift;
        my $pck;    # Packed information except reference

        # purposefully avoiding dispatch table
        # This keeps us at HoC
        return sub {
            if    ( $_[0] eq 'data' ) { return $ref }
            elsif ( $_[0] eq 'blah' ) { return unpack( 'A1', $pck ) }
            # where each unpack format gets desired data
        };
    }

    $hash{foo} = compact( [ 1 .. 10 ] );
    $hash{foo}->( 'data' )[3] = 42;

The interface could use work, and it doesn't really explore any uncharted territory for me. It also is still not a single hash. We are just trading anonymous subs for anonymous arrays.

Cheers - L~R
Re^2: Techniques On Saving Memory by japhy (Canon) on Mar 09, 2005 at 19:19 UTC
My module (now v0.02!) ~~doesn't seem to suffer from the garbage collection problem, and it~~ doesn't require you to tell it what kind of reference you expect back. It's not pure Perl (unlike Devel::Pointer::PP), but it is easier to use. (Update: I misinterpreted the garbage collection issue.)

_____________________________________________________
Jeff `japhy` Pinyan, P.L., P.M., P.O.D, X.S.: Perl, regex, and `perl` hacker
How can we ever be the sold short or the cheated, we who for every service have long ago been overpaid? ~~ Meister Eckhart
Re^2: Techniques On Saving Memory by Limbic~Region (Chancellor) on Mar 09, 2005 at 21:03 UTC
BrowserUk,

Thinking out loud here, with only the most elementary understanding of Perl internals: why can't you fake a closure? Basically, increase the ref count without actually making the closure. You GC it by decreasing the count again. As long as you control the creation/modification/deletion of the hash keys and their values, I don't see an obvious problem.

Cheers - L~R
Re^3: Techniques On Saving Memory by BrowserUk (Patriarch) on Mar 09, 2005 at 21:27 UTC
"... why can't you fake a closure?..."

Sorry, but I don't even begin to see how to fake a closure. And how would you unfake it?

"As long as you control the creation/modification/deletion of the hash keys and their values, ..."

What hash? Why use a hash, and how would that save memory if you do the Tie::RefHash thing of storing both the smashed and unsmashed references? If you do the RefHash thing, you don't need to do any controlling: once you delete the (value) reference, Perl takes care of the rest. But trading a hash for an array is not going to save memory.
Re^4: Techniques On Saving Memory by Limbic~Region (Chancellor) on Mar 09, 2005 at 21:35 UTC
Re^5: Techniques On Saving Memory by BrowserUk (Patriarch) on Mar 09, 2005 at 22:28 UTC
(Replies below this depth are not shown.)
Re^2: Techniques On Saving Memory by diotalevi (Canon) on Mar 09, 2005 at 23:32 UTC
Instead of Devel::Pointer, see Devel::Pointer::PP, which is pure Perl (with the assistance of B) instead of being XS. It's also slightly safer.
Re: Techniques On Saving Memory by japhy (Canon) on Mar 09, 2005 at 18:45 UTC
Voila! This was written in about 30 minutes (well, 5 minutes of re-reading the XS docs, 10 minutes of actual coding time, and 15 minutes of writing docs and tests and whatnot). (Update: another 10 minutes produced version 0.02, which makes one function do the job of two.)

Limbic~Region was asking about such a module on the CB, and when I described what I thought such a module should do, I decided to try writing it. Devel::RecoverRef is a module which takes a stringified or numified version of a reference (objects included!) and returns the reference (or object!) that it represents. There's not a lot of protection, though: you'll probably dump core if you make stuff up. Scrutinize, and I'll upload it to CPAN by the end of the week.
Re^2: Techniques On Saving Memory by diotalevi (Canon) on Mar 09, 2005 at 23:41 UTC
Is there some reason you didn't just write this as a wrapper over Devel::Pointer or Devel::Pointer::PP? They already handle the unsmashing part; you just add the part that gets the hex address out of strings like "Some::Class=HASH(0x81471c8)". Or why didn't you just ask either Simon or me to update our modules? It's probably a feature that should already be in there.
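The "get the hex address out of the string" step mentioned above can be sketched in plain Perl. This is only an illustration: `parse_ref_string` is a hypothetical helper, not a function from Devel::Pointer or Devel::Pointer::PP, and it does no unsmashing at all. It merely splits a stringified reference into class, type, and numeric address, and checks the result against `refaddr` and `reftype` from the core Scalar::Util module.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr reftype);

# Hypothetical helper (an assumption for illustration): split a stringified
# reference such as "Some::Class=HASH(0x81471c8)" or "ARRAY(0x81471c8)"
# into (class, type, numeric address). Returns an empty list on no match.
sub parse_ref_string {
    my ($str) = @_;
    my ( $class, $type, $hex ) =
        $str =~ /\A(?:(.+)=)?([A-Z][A-Za-z]*)\(0x([0-9a-fA-F]+)\)\z/s
        or return;
    no warnings 'portable';    # addresses can exceed 32 bits on 64-bit builds
    return ( $class, $type, hex $hex );
}

# The example string from the post above.
my ( $class, $type, $addr ) = parse_ref_string('Some::Class=HASH(0x81471c8)');
printf "%s %s 0x%x\n", $class, $type, $addr;

# Cross-check against a live, unblessed reference.
my $live = [ 1, 2, 3 ];
my ( undef, $live_type, $live_addr ) = parse_ref_string("$live");
print "type matches\n"    if $live_type eq reftype($live);
print "address matches\n" if $live_addr == refaddr($live);
```

The parsed address is only a number. Turning it back into a usable reference is the unsafe part that the modules discussed in this thread deal with.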
Re^3: Techniques On Saving Memory by japhy (Canon) on Mar 10, 2005 at 01:26 UTC
I hadn't known of those modules until after I wrote mine. They were mentioned in other nodes in this thread. Now that I know they exist, there's no need for me to release mine; you and Simon just need to simplify your modules.
Re^4: Techniques On Saving Memory by BrowserUk (Patriarch) on Mar 10, 2005 at 02:06 UTC
Re^5: Techniques On Saving Memory by japhy (Canon) on Mar 10, 2005 at 15:39 UTC
(Replies below this depth are not shown.)
Re: Techniques On Saving Memory by bart (Canon) on Mar 09, 2005 at 19:16 UTC
I'm not sure what you mean by "reference", but pack/unpack do support two templates, "P" and "p", that can be used to reference/dereference a block of memory. So if what you want is a (zero-terminated) string or a packed, fixed-length structure, you can use one of these. You must be really careful not to move the strings containing the packed data around, or you'll end up dereferencing the wrong memory block. So no changing the length of the scalar holding that block once you've started packing.

I thought of using substr to point inside a string, but it doesn't work; the following gives some form of workaround:

    $x = "one\0two\0";
    $b = pack "p", $x;                     # pointer to "one"
    # 'I' matches the pointer size on this build (4 bytes, see "length: 8" below)
    $b .= pack 'I', 4 + unpack 'I', $b;    # pointer to "two" = substr($x, 4)
    printf "length: %i\n", length $b;
    printf "%i %i\n", unpack "II", $b;
    printf "%s\n", $_ for unpack "pp", $b;

Result:

    length: 8
    2279940 2279944
    one
    two

Like I said, `$x` contains the packed data, so don't move that around.

For unpacking a fixed-length string, you can use `"P12"` for a 12 byte string, for example. Appending the following snippet to the above code, I get the following additional result:

    printf "%s\n", $_ for unpack "P2P2", $b;

    on
    tw

Note: the unpacked data is a copy of the data, not a reference into the same, original data.
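The pointer arithmetic in the post relies on 'I' having the same width as a pointer, which does not hold on most 64-bit builds. A hedged variant of the same workaround picks the integer template from the length that `pack 'p'` produces. The variable names are mine, and the sketch assumes pointers are either 4 or 8 bytes wide.

```perl
use strict;
use warnings;

# The buffer must stay alive and must not be modified or resized
# for as long as the packed pointers are in use.
my $data = "one\0two\0";

# Assumption: pointers are 4 or 8 bytes; choose an integer template to match.
my $ptr_size = length( pack( 'p', $data ) );
my $int_fmt  = $ptr_size == 8 ? 'Q' : 'L';

my $first  = pack( 'p', $data );            # pointer to "one"
my $addr   = unpack( $int_fmt, $first );    # the same pointer as an integer
my $second = pack( $int_fmt, $addr + 4 );   # pointer to "two", 4 bytes further on

# Each 'p' follows one pointer and copies out a NUL-terminated string.
my @strings = unpack( 'pp', $first . $second );
print "$_\n" for @strings;
```

As in the original, the strings that come back are copies, and nothing stops the pointers from dangling if `$data` is changed or freed.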
Re: Techniques On Saving Memory by Roy Johnson (Monsignor) on Mar 09, 2005 at 19:17 UTC
Looking at the problem differently: since you know how big each array is, you can use a hash and one array. In the hash, you store the offset to the beginning of the desired (conceptual) row in the array.

    use strict;
    use warnings;

    my %hash;
    my @array;
    my $row_length     = 5;
    my $next_available = 0;

    for (qw(cow pig donkey)) {
        $hash{$_} = $next_available;
        for my $col (0..($row_length - 1)) {
            $array[$next_available + $col] = "$_ $col";
        }
        $next_available += $row_length;
    }

    # What would have been $hash{pig}[3]:
    print $array[$hash{pig}+3];

You could even make a class implementing hashes of fixed-size arrays and give it an indexing method. I'm not going to. `:-)`

Caution: Contents may have been coded under pressure.
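For what it's worth, the class that the post above declines to write might look roughly like this. It is a minimal sketch of the same offset technique, not anyone's published module: the package name `FixedRows` and the methods `add_row`, `at`, and `set` are all made up for illustration.

```perl
package FixedRows;    # hypothetical name, for illustration only
use strict;
use warnings;

# One hash of row offsets plus one flat array holding every cell.
sub new {
    my ( $class, $row_length ) = @_;
    return bless {
        len    => $row_length,
        offset => {},
        cells  => [],
        next   => 0,
    }, $class;
}

# Append a row; it must have exactly the fixed number of columns.
sub add_row {
    my ( $self, $key, @values ) = @_;
    die "row '$key' needs exactly $self->{len} values\n"
        unless @values == $self->{len};
    die "row '$key' already exists\n" if exists $self->{offset}{$key};
    $self->{offset}{$key} = $self->{next};
    push @{ $self->{cells} }, @values;
    $self->{next} += $self->{len};
    return $self;
}

# Translate (key, column) into an index into the flat array.
sub _index {
    my ( $self, $key, $col ) = @_;
    die "no such row '$key'\n" unless exists $self->{offset}{$key};
    die "column $col out of range\n" if $col < 0 || $col >= $self->{len};
    return $self->{offset}{$key} + $col;
}

sub at  { my ( $self, $key, $col ) = @_; return $self->{cells}[ $self->_index( $key, $col ) ] }
sub set { my ( $self, $key, $col, $value ) = @_; $self->{cells}[ $self->_index( $key, $col ) ] = $value }

package main;

my $rows = FixedRows->new(5);
for my $name (qw(cow pig donkey)) {
    $rows->add_row( $name, map { "$name $_" } 0 .. 4 );
}

print $rows->at( 'pig', 3 ), "\n";    # what would have been $hash{pig}[3]
$rows->set( 'pig', 3, 42 );
print $rows->at( 'pig', 3 ), "\n";
```

Deleting a row is deliberately left out. Reclaiming its slots would need a free list or compaction, which is where the bookkeeping starts to eat into the savings.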
Re: Techniques On Saving Memory by diotalevi (Canon) on Mar 09, 2005 at 23:44 UTC
Most of the discussion by other people seems to be about how to represent the same data in a smaller space. Here is another idea: instead of storing stringified references, store them as packed data: `pack( 'N', $str =~ /x([A-Fa-f\d]+)/ ? hex( $1 ) : die )`. Your string is now four bytes long instead of at least twice that. You'd unsmash it using some "normal" method.

You could also compress your data. I've stored things in memory as six-bit characters when I knew I only wanted to use a limited character set and wanted to get the 25% space savings. Similarly with numbers, packing them can yield a shorter binary value than they have when represented as strings.

You should also consider whether your algorithm can be changed so that less data has to be in memory at any given time. This is a really good reason to check out Dominus's new book, Higher Order Perl, to get some training on using streams and iterators. Iterators are useful tools when altering your code so that less stuff is actually present at any given moment.

Offloading things to disk is another way to save memory. Use BerkeleyDB to get a good disk-bound hash.
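To put rough numbers on the packing idea, here is a small sketch. It reuses the example string quoted earlier in the thread and a made-up list of integers. Note the assumption that every value fits in 32 bits, since 'N' is a 32-bit big-endian unsigned integer; addresses on a 64-bit build would need a wider template such as 'Q' or 'J'.

```perl
use strict;
use warnings;

# Stringified reference taken from the example earlier in the thread.
my $str = 'Some::Class=HASH(0x81471c8)';
my ($hex) = $str =~ /\(0x([0-9a-fA-F]+)\)/
    or die "no address found in '$str'";

# Assumption: the address fits in 32 bits.
my $packed = pack 'N', hex $hex;
printf "address as text: %d bytes, packed: %d bytes\n",
    length("0x$hex"), length($packed);

# The same trick for plain numbers (illustrative values only).
my @nums    = ( 100_000 .. 100_009 );
my $as_text = join ',', @nums;
my $as_bin  = pack 'N*', @nums;
printf "numbers as text: %d bytes, packed: %d bytes\n",
    length($as_text), length($as_bin);

# Round trip to confirm nothing was lost.
my @back = unpack 'N*', $as_bin;
print "round trip ok\n" if "@back" eq "@nums";
```

These are payload lengths only. Each Perl scalar carries its own fixed overhead, so the saving matters most when many packed values share one string.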