Re: A memory efficient hash, trading off speed

This may be a really dumb idea, but I've used a similar method before (in BASIC) to save memory space, back when 64Kb was considered huge.

Couldn't you store the key/value pairs back to back in a single string, and then pull them out with a regex?

The format could be something like "key1:value1|key2:value2|key3:value3|" allowing you to search for the key, and if found grab it and everything up to the next | character, then split on the : character to get the corresponding value.

You could use whatever separators you wished, but there would need to be two different ones for this to work and to make it more readable for humans.

Unfortunately, I'm not confident enough with the syntax of regular expressions (yet) to give an example, but from the sounds of it you know enough to work that out for yourself and to tell if I'm spouting complete nonsense.

__________
"Every program has at least one bug and can be shortened by at least one instruction -- from which, by induction, one can deduce that every program can be reduced to one instruction which doesn't work." -- (Author Unknown)

Comment on Re: A memory efficient hash, trading off speed - does it already exist?

Replies are listed 'Best First'.
Re: Re: A memory efficient hash, trading off speed - does it already exist? by Mr_Person (Hermit) on Feb 04, 2003 at 01:29 UTC
Well, but then you run into the problem of if you want to use a colon or a pipe in one of your keys or values. I would immagine it would also be quite a bit slower for large amounts of key/value pairs than using a real hash.	[reply]
Re: Re: Re: A memory efficient hash, trading off speed - does it already exist? by Wysardry (Pilgrim) on Feb 04, 2003 at 02:05 UTC
That's why I said that any separating characters could be used, meaning that ones that didn't clash could be picked. As a Perl program would most likely be used to convert the existing hashes, characters not normally directly accessible on the keyboard (above ASCII 127) could be used. The original poster also stated that speed wasn't an issue. It wouldn't be practical for every application, but might work in this instance. A similar method was used by several text adventure games on machines with limited memory or no array facilities.	[reply]
Re: Re: A memory efficient hash, trading off speed - does it already exist? by JPaul (Hermit) on Feb 04, 2003 at 04:37 UTC
Indeed I am also considering something similar to this, although I'm not entirely sure if it would work quite as I hope. The data is constructed like so: `%hash{Keyword} -> {Symbol_1} = Weighting %hash{Keyword} -> {Symbol_2} = Weighting` [download] And so on. I think perhaps I could slim down memory considerably by doing something like the following: `%hash{Keyword} = "Symbol_1 \| Weighting, Symbol_2 \| Weighting"` [download] And, as Mr_Person has already noted, no I don't intend to use \| and , as separators. Already when writing data to the disk as a "brainsave" I use \x255 as a record separator, just like you were thinking. Perhaps the fact that I started programming in BASIC as well might have something to do with our shared mindset? :) JP, -- Alexander Widdlemouse undid his bellybutton and his bum dropped off --	[reply] [d/l] [select]
Re^3: A memory efficient hash, trading off speed - does it already exist? by Aristotle (Chancellor) on Feb 06, 2003 at 11:57 UTC
If anything, you should probably switch to a structure like `$hash{"Keyword\0Symbol_1"} = $Weighting[1]; $hash{"Keyword\0Symbol_2"} = $Weighting[2];` [download] That way you still get full hash lookup performance on the entire (two dimensional) key. And you avoid creating whole legions of hashes. That should pretty much fix 90% of your problem. And such a flat hash is trivial to tie to a DBM file. Don't knock DBMs without benchmarking them. DBM files are specifically laid out to allow for rapid retrieval - paging just indiscriminately swaps out whatever seems least necessary, including other parts of the system besides the data. Drawing any conclusions about one based on the other borders on ridiculous. Makeshifts last the longest.	[reply] [d/l]
Re: Re^3: A memory efficient hash, trading off speed - does it already exist? by JPaul (Hermit) on Feb 06, 2003 at 17:57 UTC
This requires I know what the following symbol is. The point of the hash is to look up potential following symbols of the keyword, unless there's someway to glob hash keys without going through them to compile a list, it's not going to work at all. JP, -- Alexander Widdlemouse undid his bellybutton and his bum dropped off --	[reply]
Re^5: A memory efficient hash, trading off speed - does it already exist? by Aristotle (Chancellor) on Feb 07, 2003 at 12:57 UTC
Re: Re^5: A memory efficient hash, trading off speed - does it already exist? by JPaul (Hermit) on Feb 07, 2003 at 16:01 UTC
Some notes below your chosen depth have not been shown here


Perl-Sensitive Sunglasses
	PerlMonks

Re: A memory efficient hash, trading off speed - does it already exist?