Dear Monks,
I am looking to reduce the startup time and the memory footprint of an application which uses a large hash of hashes to hold data.
One obvious approach occurred to me: tie the hash to a file or database, so that I wouldn't have to repopulate it on every run of the program and it wouldn't all be held in memory. Unfortunately I don't have a lot of experience in these matters.
Looking around CPAN and the monastery, folks seemed very keen on three main candidates:
Tie::DBI: "Tie hashes to SQL databases"
DBM::Deep (Re: Can I tie a hash of hashes to a file?): "A unique flat-file database module, written in pure perl"
BerkeleyDB::Hash (Re: Managing a graph with large number of nodes): BerkeleyDB based, obviously!
I wanted to write some benchmarks to test which was the fastest, most space efficient, and simplest to use, but never really got very far:
Tie::DBI
You must pre-create the database, using DBI
While you can subsequently use the tied hash like a normal hash, doing the tie is fairly complex and requires messing around with flags
BerkeleyDB::Hash
Again, you must pre-create the database and mess around with a lot of setup flags, just to get it working...
I was put off, because I just wanted a simple 'plug and play' solution... However:
DBM::Deep
Very simple interface
"True multi-level hash/array support (unlike MLDBM, which is faked), hybrid OO / tie() interface, cross-platform FTPable files, ACID transactions, and is quite fast. Can handle millions of keys and unlimited levels without significant slow-down. Written from the ground-up in pure perl -- this is NOT a wrapper around a C-based DBM. Out-of-the-box compatibility with Unix, Mac OS X and Windows."
Relatively simple flags (but more if you want to go deeper):
tie %hash, "DBM::Deep", {
    file      => "foo.db",
    locking   => 1,
    autoflush => 1,
};
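To make the hash-of-hashes case concrete, here is a minimal sketch of how the tie above might be used across two runs. It assumes DBM::Deep is installed from CPAN (it is not a core module); the file name records.db and the gene1/gene2 records are made up purely for illustration.

```perl
use strict;
use warnings;
use DBM::Deep;    # assumption: installed from CPAN, not a core module

# "records.db" is a hypothetical file name used only for this sketch.
my $file = "records.db";

# First run: populate the store. Each inner hashref is stored as a sub-hash.
{
    tie my %data, "DBM::Deep", {
        file      => $file,
        locking   => 1,
        autoflush => 1,
    };
    $data{gene1} = { name => "abc", length => 1200 };    # made-up records
    $data{gene2} = { name => "xyz", length => 800 };
    untie %data;
}

# Later run: re-open the same file. The data is read from disk on demand,
# so nothing has to be repopulated at startup.
tie my %data, "DBM::Deep", {
    file      => $file,
    locking   => 1,
    autoflush => 1,
};
print "$_ => $data{$_}{name} ($data{$_}{length})\n" for sort keys %data;
```

The second tie stands in for a fresh run of the program; in a real application the populate step would only be needed when the file does not exist yet.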
So my questions:
Am I being stupid / impatient / lazy / missing a serious performance perk / etc. by discounting everything but DBM::Deep?
Are there any other approaches I should consider, which offer the same advantages as DBM::Deep, but do it in a sufficiently different way to warrant benchmarking?
And for those that have experience of it: are there any pitfalls I should be aware of with DBM::Deep?
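On the benchmarking point, a comparison could start from something like the sketch below, which uses the core Benchmark module to time a plain in-memory hash against a DBM::Deep-tied one. It assumes DBM::Deep is installed; the key count, the record layout, and the fill/scan helpers are arbitrary choices for illustration, not measurements or code from the real application.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use File::Temp qw(tempdir);
use DBM::Deep;    # assumption: installed from CPAN

my $dir = tempdir(CLEANUP => 1);    # scratch directory, removed at exit
my $n   = 200;                      # arbitrary, deliberately small key count

my %plain;
tie my %deep, "DBM::Deep", { file => "$dir/bench.db" };

# Hypothetical helper: write $n small records into the given hash.
sub fill {
    my ($h) = @_;
    $h->{"key$_"} = { id => $_, label => "item $_" } for 1 .. $n;
    return;
}

# Hypothetical helper: read one field back from every record.
sub scan {
    my ($h) = @_;
    my $sum = 0;
    $sum += $h->{"key$_"}{id} for 1 .. $n;
    return $sum;
}

# Run each candidate for at least one CPU second and print a comparison table.
cmpthese(-1, {
    plain_hash => sub { fill(\%plain); scan(\%plain) },
    dbm_deep   => sub { fill(\%deep);  scan(\%deep) },
});
```

Other candidates (Tie::DBI, BerkeleyDB::Hash) could be added as further entries in the cmpthese hash once their setup is in place. This only measures access speed, not startup time or memory use.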
Finally - am I even on the right track to solving my original problem (see 500 lines above...)!?
Sorry for the long question and thanks in advance!
Just a something something...
In reply to Tie Hash
by BioLion