Dear Monks,

I am looking to reduce the startup time and also the memory impact of an application which uses a large hash of hashes to hold data.

One obvious way occurred to me: tie the hash to a file or database, so that I wouldn't have to repopulate it on every run of the program and it wouldn't be held in memory. Unfortunately, I don't have a lot of experience in these matters.

Looking around CPAN and the monastery, folks seemed very keen on three main candidates:

Tie::DBI

Tie hashes to SQL databases

DBM::Deep (Re: Can I tie a hash of hashes to a file?)

"A unique flat-file database module, written in pure perl"

BerkeleyDB::Hash (Re: Managing a graph with large number of nodes)

BerkeleyDB based, obviously!

I wanted to write some benchmarks to test which was the fastest, most space efficient, and simplest to use, but never really got very far:

Tie::DBI

DBI

While you can subsequently use the tied hash like a normal hash, doing the tie is fairly complex and requires messing around with flags.

BerkeleyDB::Hash

Again, you must pre-create the database and mess around with a lot of setup flags, just to get it working...

I was put off, because I just wanted a simple 'plug and play' solution... However:

DBM::Deep

Very simple interface

"True multi-level hash/array support (unlike MLDBM, which is faked), hybrid OO / tie() interface, cross-platform FTPable files, ACID transactions, and is quite fast. Can handle millions of keys and unlimited levels without significant slow-down. Written from the ground-up in pure perl -- this is NOT a wrapper around a C-based DBM. Out-of-the-box compatibility with Unix, Mac OS X and Windows."

use DBM::Deep;

tie my %hash, "DBM::Deep", {
    file      => "foo.db",
    locking   => 1,
    autoflush => 1,
};
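
For what it's worth, this is roughly how I would expect my hash of hashes to behave through that tie. It is only a minimal sketch, assuming DBM::Deep is installed; the temporary file, the key names and the values are made up for illustration and are not from my real application.

```perl
use strict;
use warnings;
use DBM::Deep;
use File::Temp qw(tempdir);

# Hypothetical scratch location; a real application would use a fixed path.
my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/foo.db";

# First "run": populate the nested structure once.
{
    tie my %hash, 'DBM::Deep', { file => $file, locking => 1, autoflush => 1 };
    $hash{gene1} = { name => 'abc', length => 1200 };    # made-up example data
}   # %hash goes out of scope here, which releases the tie

# Later "run": re-tie the same file and read the data without repopulating it.
tie my %hash, 'DBM::Deep', { file => $file, locking => 1, autoflush => 1 };
print "$hash{gene1}{name} $hash{gene1}{length}\n";
```

The nested hash is written to the file when it is assigned, so the second tie only has to open the file rather than rebuild the whole structure.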

So my questions:

Am I being stupid / impatient / missing a serious performance perk / lazy / etc... by discounting everything but DBM::Deep?

Are there any other approaches I should consider, which offer the same advantages as DBM::Deep, but do it in a sufficiently different way to warrant benchmarking?

And for those that have experience of it - are there any pitfalls I should be aware of with DBM::Deep?

Finally - am I even on the right track to solving my original problem (see 500 lines above...)!?

Sorry for the long question and thanks in advance!

Just a something something...
