I have some data I'm going to be parsing to store in a database. This data has a dozen to two dozen fields. The problem is that not every "record" has all of these fields, so one record might have 12 fields, and the next one only has 5 and the next has 24. So my problem is, how should I design a table to accomodate this?

The answer is 'depends'.

1.) Depends upon how big your data set is going to get

2.) Depends on how the data is going to be retrieved (through occasional big queries, or short small ones.

3.) Depends on the type of data the fields are (null Integers don't take up a lot of room, for example, so no reason to worry about them being empty).

I've gone different ways, depending on the above

In general, though, if you're 30+ columns are a normalized set of data, and the 'sometimes' values aren't that long, then you really can't go wrong with KISS. It's the best way to ensure the most long-term variability of use.

If you know that you're rarely going to do 'long batch queries', and this is more for a fast-retrieval of single records, then Option 2 isn't a bad choice. Option 3 even works as long as your '5 or so fields' are the one you match on most often.

A variation on Option 3 that I've used *once* in my life, for a very specific design, is to create two tables, with a one-to-one required relationship. I put the smaller, searchable subset in table one, and the rest (mostly message headers, various textual fields, etc...) into the other table.

That served two purposes:

1.) The main table was small, compact, and highly indexed, so searching, exporting, debugging, etc, was a snap.

2.) The second table, after some period of time, was determined to not be necessary, except for two fields, so it was easy to move the two fields to main table and just drop the second one.

Hope some of that's useful

Trek


In reply to Re: (OT) Database design: omitted partially filled fields. by TrekNoid
in thread (OT) Database design: omitted partially filled fields. by BUU

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.