in reply to How to create a compact data structure?

My approach with such things is to use a tied hash (Berkeley DB in my case), and to pack and unpack the flag and count. It works well enough and is simple enough so I don't make ugly programming errors:

my $mask = "NN";    # two 32-bit unsigned integers: count, flag
foreach my $item (LARGE_LIST) {
    my $key = property($item);
    # Start from a zeroed record the first time a key is seen
    my $val = $record{ $key } || pack $mask, 0, 0;
    my ($count, $flag) = unpack $mask, $val;
    $count++;
    $flag ||= condition_is_true($item);
    $record{ $key } = pack $mask, $count, $flag;
}
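The snippet above leaves out the tie call itself. Here is a minimal, self-contained sketch of how it might look. Note the assumptions: I use SDBM_File as a stand-in because it ships with every Perl, whereas the post uses Berkeley DB; with DB_File you would `use DB_File` and pass `$DB_HASH` as an extra final argument to `tie`. The item list, the first-letter key and the `/^av/` test are made-up placeholders for LARGE_LIST, property() and condition_is_true().

```perl
use strict;
use warnings;
use Fcntl;                          # O_RDWR, O_CREAT
use SDBM_File;                      # stand-in for DB_File (assumption)
use File::Temp qw(tempdir);

# Keep the database files in a scratch directory that is removed on exit
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/records";

tie my %record, 'SDBM_File', $file, O_RDWR | O_CREAT, 0666
    or die "Cannot tie $file: $!";

my $mask = "NN";
for my $item (qw(apple avocado banana apricot)) {    # placeholder data
    my $key = substr $item, 0, 1;                    # placeholder for property()
    my $val = $record{$key} || pack $mask, 0, 0;
    my ($count, $flag) = unpack $mask, $val;
    $count++;
    $flag ||= ($item =~ /^av/) ? 1 : 0;              # placeholder condition
    $record{$key} = pack $mask, $count, $flag;
}

# Read two records back before releasing the tie
my ($a_count, $a_flag) = unpack $mask, $record{a};
my ($b_count, $b_flag) = unpack $mask, $record{b};
untie %record;

print "a: count=$a_count flag=$a_flag\n";
print "b: count=$b_count flag=$b_flag\n";
```

The loop body is unchanged from the in-memory version; only the `tie` line decides where the data lives.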

Even without a tied hash, you'll cut your memory use almost in half, because you no longer need to allocate the anonymous hash and the scalar for the flag.

Re^2: How to create a compact data structure?
by shmem (Chancellor) on Dec 19, 2006 at 23:08 UTC
    Since the flag is boolean, why allocate a long for it? A byte should do.
    my $mask = "NC";
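To make the saving concrete, here is a small sketch comparing the two templates. The values 42 and 1 are arbitrary examples. `N` packs a 32-bit big-endian unsigned integer and `C` an unsigned char, so the record should shrink from 8 bytes to 5.

```perl
use strict;
use warnings;

# Same count and flag, packed with each template
my $long_long = pack "NN", 42, 1;   # 4 bytes + 4 bytes
my $long_byte = pack "NC", 42, 1;   # 4 bytes + 1 byte

my $nn_len = length $long_long;
my $nc_len = length $long_byte;

# The shorter record still round-trips both fields
my ($count, $flag) = unpack "NC", $long_byte;

print "NN: $nn_len bytes, NC: $nc_len bytes, count=$count flag=$flag\n";
```

Keep in mind that `C` only holds 0 to 255, which is fine for a boolean but not for the count.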

    --shmem

    _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                  /\_¯/(q    /
    ----------------------------  \__(m.====·.(_("always off the crowd"))."·
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
Re^2: How to create a compact data structure?
by bart (Canon) on Dec 19, 2006 at 20:58 UTC
    My approach with such things is to use a tied hash (Berkeley DB in my case)

    Can you give me an idea on the speed of using a Berkeley DB? Before even bothering to try installing it, I'd like to know by how much it would slow down.

      Can you give me an idea on the speed of using a Berkeley DB?

      Not without profiling your application. You do take the tie hit on hash accesses, but if your working set is so large that it won't fit into memory at once, you avoid thrashing your virtual memory system.

      No matter what approach you take, you'll likely spend more time managing your data structures in an appropriate format, so you'll always pay a penalty over the naïve approach. Your goal is to amortize that over all operations to avoid a sudden slowdown at the end.