Re: Techniques On Saving Memory
by BrowserUk (Patriarch) on Mar 09, 2005 at 18:55 UTC
|
Take a look at Devel::Pointer. It has routines for unsmashing smashed references. It works, and I have used it to do something very similar to what you are attempting.
One problem is that you also need to store the type of ref (scalar/array/hash/code) along with the smashed address. That screws the otherwise convenient 32-bit / element storage requirement making the math messier than it ought to be.
The second is you also need to store unsmashed copies of the references so that perl will not GC the elements to which you are storing smashed refs.
That duplication of storage negates much of the benefit, combined with the fact that you are now responsible for managing the memory instead of allowing Perl to take care of it, makes for a heavier and riskier module than might otherwise be the case.
I have the germs of an idea for dealing with that also, but the tools are not available to me to implement it (yet).
Examine what is said, not who speaks.
Silence betokens consent.
Love the truth but pardon error.
| [reply] |
|
|
you also need to store unsmashed copies of the references so that perl will not GC the elements to which you are storing smashed refs
Ooo, ouch. Good point.
I think you will find, case after case, that by the time you're done "improving" a perl feature, you're going to have a solution that is as complex (in memory and time and code and shortcomings) as the perl implementation itself.
-- [ e d @ h a l l e y . c c ]
| [reply] |
|
|
I've been trying to get this idea right for almost two years now. On two occasions I've even reached the point where I thought I was ready to let code loose on the world, but then I discovered something which I wasn't able to handle without duplicating the references in a hash, which completely negates any benefit at all.
As I said, I have the germ of a solution, but as I am entirely uncomfortable in using XS, or even XS modules in ways that move beyond their authors intent, it will remain just an idea until I can think of a way of proving it's safety.
Examine what is said, not who speaks.
Silence betokens consent.
Love the truth but pardon error.
| [reply] |
|
|
sub compact {
my $ref = shift;
my $pck; # Packed information except reference
# purposefully avoiding dispatch table
# This keeps us at HoC
return sub {
if ( $_[0] eq 'data' ) { return $ref };
elsif ( $_[0] eq 'blah' ) { return unpack('A1', $pck) }
# where each unpack format gets desired data
};
}
$hash{foo} = compact( [1..10] );
$hash{foo}->( 'data' )[3] = 42;
The interface could use work and it doesn't really explore any uncharted territory for me. It also is still not a single hash. We are just trading anonymous subs for anonymous arrays.
| [reply] [d/l] |
|
|
| [reply] |
|
|
BrowserUk,
Thinking out loud here with only the most elementary understanding of Perl internals - why can't you fake a closure? Basically increase the ref count without actually making the closure. You GC it by decreasing it. As long as you control the creation/modification/deletion of the hash keys and their values, I don't see an obvious problem?
| [reply] |
|
|
... why can't you fake a closure?...
Sorry, but I don't even begin to see how to fake a closure? And how would you unfake it?
As long as you control the creation/modification/deletion of the hash keys and their values, ...
What hash? Why use a hash and how would that save memory if you do the Tie::RefHash thing of storing the smashed and unsmashed references?
If you do the RefHash thing, you don't need to do any controlling--once you delete the (value) reference, Perl takes care of the rest. But trading a hash for an array is not going to save memory.
Examine what is said, not who speaks.
Silence betokens consent.
Love the truth but pardon error.
| [reply] |
|
|
|
|
|
|
|
| [reply] |
Re: Techniques On Saving Memory
by japhy (Canon) on Mar 09, 2005 at 18:45 UTC
|
Voila! This was written in about 30 minutes (well, 5 minutes of re-reading the XS docs, 10 minutes of actual coding time, and 15 minutes of writing docs and tests and what-not). (Update: another 10 minutes produced version 0.02, which makes one function do the job of two.) Limbic~Region was asking about such a module on the CB, and when I described what I thought such a module should do, I decided to try writing it.
Devel::RecoverRef is a module which takes a stringified or numified version of a reference (objects included!) and returns the reference (or object!) that is represented therein. There's not a lot of protection, though. You'll probably dump core if you make stuff up. Scrutinize, and I'll upload it to CPAN by the end of the week.
_____________________________________________________
Jeff japhy Pinyan,
P.L., P.M., P.O.D, X.S.:
Perl,
regex,
and perl
hacker
How can we ever be the sold short or the cheated, we who for every service have long ago been overpaid? ~~ Meister Eckhart
| [reply] |
|
|
Is there some reason you didn't just write this as a wrapper over Devel::Pointer or Devel::Pointer::PP? They already handle the unsmashing part - you just add the part to get the hex address out of strings like "Some::Class=HASH(0x81471c8)". Or why didn't you just ask either Simon or me to update our modules? Its probably a feature that should already be in there.
| [reply] |
|
|
I hadn't known of those modules until after I wrote mine. They were mentioned in other nodes in this thread. Now that I know they exist, there's no need for me to release mine; you and Simon just need to simplify your modules.
_____________________________________________________
Jeff japhy Pinyan,
P.L., P.M., P.O.D, X.S.:
Perl,
regex,
and perl
hacker
How can we ever be the sold short or the cheated, we who for every service have long ago been overpaid? ~~ Meister Eckhart
| [reply] |
|
|
|
|
|
Re: Techniques On Saving Memory
by bart (Canon) on Mar 09, 2005 at 19:16 UTC
|
I'm not sure what you mean by "reference", but pack/unpack do contain support for two kinds of templates, "P" and "p", that can be used to reference/dereference some block of memory. So if what you want is a (zero-terminated) string or a packed, fixed length structure, you can use one of these. You must be really careful not to move your strings containing the packed data around, or you'll end up dereferencing the wrong memory block. So no changing the length of the scalar holding that block, once you've started packing.
I thought of using substr to point inside a string, but it doesn't work; but the following gives some form of workaround:
$x = "one\0two\0";
$b = pack "p", $x; # pointer to "one"
$b .= pack 'I', 4 + unpack 'I', $b; # pointer to "two" = substr($x, 4)
printf "length: %i\n", length $b;
printf "%i %i\n", unpack "II", $b;
printf "%s\n", $_ for unpack "pp", $b;
Result:
length: 8
2279940 2279944
one
two
Like I said, $x contains the packed data, so don't move that around.
For unpacking a fixed length string, you can use "P12" for a 12 byte string, for example. Appending the following snippet to the above code, I get the following additional result:
printf "%s\n", $_ for unpack "P2P2", $b;
on
tw
Note: the unpacked data is a copy of the data, not a reference into the same, original data. | [reply] [d/l] [select] |
Re: Techniques On Saving Memory
by Roy Johnson (Monsignor) on Mar 09, 2005 at 19:17 UTC
|
Looking at the problem differently, since you know how big each array is, you can use a hash and one array. In the hash, you store the offset to the beginning of the desired (conceptual) row in the array.
use strict;
use warnings;
my %hash;
my @array;
my $row_length = 5;
my $next_available = 0;
for (qw(cow pig donkey)) {
$hash{$_} = $next_available;
for my $col (0..($row_length - 1)) {
$array[$next_available + $col] = "$_ $col";
}
$next_available += $row_length;
}
# What would have been $hash{pig}[3]:
print $array[$hash{pig}+3];
You could even make a class implementing hashes of fixed-size arrays and give it an indexing method. I'm not going to. :-)
Caution: Contents may have been coded under pressure.
| [reply] [d/l] |
Re: Techniques On Saving Memory
by diotalevi (Canon) on Mar 09, 2005 at 23:44 UTC
|
Most of the discussion by other people seems to be about how to represent the same data in a smaller space. Here is another idea - instead of storing stringifid references, store them as packed data: pack( 'N', $str =~ /x([A-F\d]+)/ ? hex( $1 ) : die ). Your string is now four bytes long instead of at least twice that. You'd unsmash it using some "normal" method. You could also compress your data. I've stored things in memory as six bit characters when I knew I only wanted to use a limited character set and wanted to get the 25% space savings. Similarly with numbers, packing them can yield a shorter binary value than they are when represented as strings. You shoul also consider whether your algorithm can be changed so that less data has to be in memory at any given time. This is a really good reason to check out Dominus new book Higher Order Perl to get some training on using streams and iterators. Iterators are useful tools to use when altering your code so that less stuff is actually present at any given moment. Offloading things to disk is another way to save memory. Use BerkeleyDB to get a good disk-bound hash.
| [reply] [d/l] |