Dear Monks,

I am looking to reduce the startup time and also the memory impact of an application which uses a large hash of hashes to hold data.

One obvious approach occurred to me: tie the hash to a file or database, so that I wouldn't have to repopulate it on every run of the program and it wouldn't have to be held entirely in memory. Unfortunately I don't have a lot of experience in these matters.

Looking around CPAN and the monastery, folks seemed very keen on three main candidates:

  • Tie::DBI
      Tie hashes to SQL databases
  • DBM::Deep (Re: Can I tie a hash of hashes to a file?)
      "A unique flat-file database module, written in pure perl"
  • BerkeleyDB::Hash (Re: Managing a graph with large number of nodes)
      BerkeleyDB-based, obviously!

I wanted to write some benchmarks to test which was the fastest, most space-efficient, and simplest to use, but never really got very far:

  • Tie::DBI
      You must pre-create the database, using DBI
      While you can subsequently use the tied hash like a normal hash, doing the tie is fairly complex and requires messing around with flags
  • BerkeleyDB::Hash
      Again, you must pre-create the database and mess around with a lot of setup flags, just to get it working...
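
For reference, this is roughly the smallest BerkeleyDB::Hash setup I could piece together. It is only a sketch: the file name is made up, and I am assuming the DB_CREATE flag is enough to have the library create the file when it is missing. Note that the values here are flat strings, so a hash of hashes would need to be serialized first (or wrapped with something like MLDBM).

```perl
use strict;
use warnings;
use BerkeleyDB;

my $file = "bdb_example.db";   # hypothetical file name, for illustration only

# DB_CREATE asks the library to create the file if it does not exist yet.
tie my %h, 'BerkeleyDB::Hash',
    -Filename => $file,
    -Flags    => DB_CREATE
    or die "Cannot open $file: $! $BerkeleyDB::Error\n";

# Values are plain strings; nested data would have to be serialized first.
$h{apple}  = "red";
$h{banana} = "yellow";

print "$_ => $h{$_}\n" for sort keys %h;

untie %h;
```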

I was put off, because I just wanted a simple 'plug and play' solution... However:

  • DBM::Deep
      Very simple interface
      "True multi-level hash/array support (unlike MLDBM, which is faked), hybrid OO / tie() interface, cross-platform FTPable files, ACID transactions, and is quite fast. Can handle millions of keys and unlimited levels without significant slow-down. Written from the ground-up in pure perl -- this is NOT a wrapper around a C-based DBM. Out-of-the-box compatibility with Unix, Mac OS X and Windows."
      Relatively simple flags (but more if you want to go deeper):
      tie %hash, "DBM::Deep", { file => "foo.db", locking => 1, autoflush => 1 };
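
To show what I mean by "simple", here is a minimal sketch of the kind of thing I tried with the tie line above. The file name and the keys (gene1, score, hits) are invented for the example; the point is only that a nested structure can be assigned directly and read back after the hash is untied and tied again.

```perl
use strict;
use warnings;
use DBM::Deep;

my $file = "foo_example.db";   # hypothetical file name, for illustration only

# Same options as the tie line above; the file is created if it is missing.
tie my %hash, "DBM::Deep", { file => $file, locking => 1, autoflush => 1 };

# Nested hashes and arrays are stored directly, with no manual serialization.
$hash{gene1} = { name => "abc", hits => [ 1, 2, 3 ] };
$hash{gene1}{score} = 42;

# Untie and tie again to check that the data really comes back from disk.
untie %hash;
tie my %again, "DBM::Deep", { file => $file, locking => 1, autoflush => 1 };

print "score: $again{gene1}{score}\n";
print "hits:  @{ $again{gene1}{hits} }\n";
```

Nothing is loaded into memory up front here; each lookup goes to the file, which is the trade-off I am hoping will help with both startup time and memory use.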

So my questions:

  • Am I being stupid / impatient / lazy / missing a serious performance perk / etc. by discounting everything but DBM::Deep?
  • Are there any other approaches I should consider, which offer the same advantages as DBM::Deep, but do it in a sufficiently different way to warrant benchmarking?
  • And for those who have experience of it: are there any pitfalls I should be aware of with DBM::Deep?
  • Finally, am I even on the right track to solving my original problem (see 500 lines above...)!?

Sorry for the long question and thanks in advance!

    Just a something something...

    In reply to Tie Hash by BioLion
