Greetings all;
I have a piece of code I've been fiddling around with that's designed to emulate natural speech, learning from users' input. (Very simply, a learning chatterbox.)

I've been surprised by how much memory the data takes up, given how small it is when written to disk. I use twin hashes, storing practically the same data, but in a different order. The script learns a sentence in two directions (front to back, back to front) so it can generate a sentence in either direction from a given keyword.
Right now each hash takes up 727k on disk (a 1.4M "brain" in total), but when loaded into memory takes up a remarkable 16M! (I've loaded the software without data to verify this.)
My hash is put together like so:

$VAR1 = {
    'Word1_Word2' => {
        'Sym1' => 3,
        'Sym2' => 1
    },
    'Word3_Word4' => {
        'Sym4' => 3,
        'Sym3' => 1
    },
    'Word5_Word6' => {
        'Sym5' => 1
    }
};
For comparison, I write every entry to disk in the format:
Word1 \a Word2 \00 Sym1 \00 3 \n
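To make the mapping between the two representations concrete, here is a minimal sketch of reading such records back into the nested hash. It assumes the spaces in the format line above are only there for readability, so a real record looks like Word1, BEL, Word2, NUL, Sym1, NUL, count, newline. The names parse_record and load_records are hypothetical helpers for illustration, not the actual script.

```perl
use strict;
use warnings;

# Assumption: one record is "Word1\aWord2" NUL "Sym1" NUL "3" newline,
# with no literal spaces around the separators.
sub parse_record {
    my ($line) = @_;
    chomp $line;
    my ($pair, $sym, $count) = split /\0/, $line, 3;
    return unless defined $count;            # malformed record: return empty list
    my ($w1, $w2) = split /\a/, $pair, 2;
    return unless defined $w2;
    return ("${w1}_${w2}", $sym, $count + 0);
}

# Hypothetical loader: builds { 'Word1_Word2' => { 'Sym1' => 3, ... }, ... }
sub load_records {
    my (@lines) = @_;
    my %brain;
    for my $line (@lines) {
        my ($key, $sym, $count) = parse_record($line) or next;
        $brain{$key}{$sym} = $count;
    }
    return \%brain;
}

# Usage: build two sample records with join() so the NUL is never
# followed directly by a digit inside a double-quoted string.
my @sample = (
    join("\0", "Word1\aWord2", "Sym1", 3) . "\n",
    join("\0", "Word1\aWord2", "Sym2", 1) . "\n",
);
my $brain = load_records(@sample);
print "$_ => $brain->{Word1_Word2}{$_}\n" for sort keys %{ $brain->{Word1_Word2} };
```

Each on-disk record carries only the raw bytes of the words and the count, whereas every entry in the in-memory structure is a separate Perl scalar inside a separate inner hash, which is presumably where much of the difference in size comes from.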
Can you fine gentlemonks suggest a better way of storing the data in memory that is still easy to reference?
My thanks,

JP,
-- Alexander Widdlemouse undid his bellybutton and his bum dropped off --


In reply to A more memory efficient storage structure? by JPaul
