The issues in optimizing performance for data storage are primarily hardware/operating-system driven. A partial list of the issues/thoughts/concerns would include:

Your questions are actually the same question, from different directions. There is a better way of structuring the data, and that's by keeping the metadata around. The most important piece of metadata you want is the rowsize. If every row has the same size, the rowsize tells you where the data you're looking for lives: row N starts at N times the rowsize, so you can seek straight to it instead of scanning. You also want to keep some set of indices which allow you to quickly determine which row numbers have what data in what column. As for storing this metadata ... most datastores keep a separate storage area for this. Oracle keeps it as part of the data file and MySQL keeps it as a separate file (at least for MyISAM tables).
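Here is a minimal sketch of the rowsize-plus-index idea, in Python for brevity. Everything in it is an assumption made for illustration: the three-column layout, the names `ROW`, `write_rows` and `read_row`, and the choice to index the age column. It is not how Oracle or MySQL actually lay out their files.

```python
import io
import struct

# Hypothetical fixed-width row: 4-byte id, 16-byte name, 4-byte age.
# "<" means little-endian with no padding, so the size is exactly 4 + 16 + 4.
ROW = struct.Struct("<I16sI")
ROWSIZE = ROW.size


def write_rows(f, rows):
    """Append rows to f and return an index: age -> list of row numbers."""
    index = {}
    for rownum, (rid, name, age) in enumerate(rows):
        # struct pads the name with NUL bytes up to 16 bytes.
        f.write(ROW.pack(rid, name.encode("ascii"), age))
        index.setdefault(age, []).append(rownum)
    return index


def read_row(f, rownum):
    """Jump straight to a row using the rowsize; no scanning."""
    f.seek(rownum * ROWSIZE)
    rid, name, age = ROW.unpack(f.read(ROWSIZE))
    return rid, name.rstrip(b"\0").decode("ascii"), age


# An in-memory buffer stands in for the data file.
f = io.BytesIO()
index = write_rows(f, [(1, "alice", 30), (2, "bob", 25), (3, "carol", 30)])

# "Which rows have age 30?" is answered by the index, then each row is
# fetched with a single seek.
matches = [read_row(f, n) for n in index[30]]
```

The index here lives in memory; a real datastore would have to persist it somewhere, which is the "separate storage area" question above.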

Column types, sizes, referential integrity ... that's all used to aid the developer. SQLite is a good example of a datastore that doesn't make any use of that stuff. (Well, very, very little use.) Column types and sizes can also help the datastore in some optimizations when calculating rowsize. (See above for why rowsize is important.)
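To make the rowsize point concrete, here is a rough sketch of computing a rowsize from declared column types and comparing it with storing the same row as delimited text. The schema, the column names, and the one-delimiter-per-value text format are all made up for this example.

```python
import struct

# Hypothetical schema: (column name, struct format code).
# "I" is a 4-byte unsigned integer, "8s" is an 8-byte fixed-width string.
SCHEMA = [("id", "I"), ("price_cents", "I"), ("code", "8s")]


def typed_rowsize(schema):
    """Rowsize implied by the declared column types (no padding)."""
    fmt = "<" + "".join(code for _, code in schema)
    return struct.calcsize(fmt)


def text_rowsize(values):
    """Bytes needed to store the same values as text, assuming one
    delimiter (or newline) after each value."""
    return sum(len(str(v)) for v in values) + len(values)


row = (1234567890, 1999, "ABCDEFGH")
typed = typed_rowsize(SCHEMA)   # the same for every row
as_text = text_rowsize(row)     # depends on the values in the row
```

The typed size is known before any data is written and never varies, which is what makes seeking by row number possible. The text size changes from row to row, so finding row N means scanning.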

Oh - if you want any sort of decent performance, you will end up rewriting this in C.

This won't help in the actual implementation, but look for stuff written by Fabian Pascal on some theory behind efficient data storage, especially with the relational model. He has a lot of ... good ... articles on the web.

Updates: Wording changes on the sidenote about compression on a per-row basis. Added an example of savings when using data types.

------
We are the carpenters and bricklayers of the Information Age.

Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose

I shouldn't have to say this, but any code, unless otherwise stated, is untested


In reply to Re: (Real) Database Design by dragonchild
in thread (Real) Database Design by Anonymous Monk
