Recently I found myself wishing to store a tree in a database. A secondary goal was to do so very, very fast. (Why? Because). After much pondering and head scratching I came up with what is basically an adjacency list. Here is an example:

Give a tree that looks like:
A / \ B C / \ \ D E F
I store that in a table that looks like this:
node_id | ancestor | distance ----------------------------- B | A | 1 C | A | 1 D | B | 1 D | A | 2 E | B | 1 E | A | 2 F | C | 1 F | A | 2
Note that nodes D, E and F have multiple rows in the table.

The advantage of this lay out is that it is very, very fast to select from. I can get any number of children with a single query. For example, all of the children of B would be: SELECT * FROM table WHERE ancestor = 'B' which returns "D" and "E". The downside is that it takes more room on disk. However, some basic testing with SQLite showed me that I could fit some 5 million of these rows in a table as small as 70 megs, so I'm not overly worried about that.

As I've described it, my table works great. Selecting is very fast, I can find any level I want easily, it's fast to add new ones and so on. My single problem is that I figure out how to order my nodes once I've selected them. Give the above tree/table, my ideal order looks like: A, B, D, E, C, F.

So I get to my question. How can I sort my nodes so that they come out in the order I want? I'd definitely prefer something that would be pure sql, mostly for speed purposes. I'm also open to suggestions of what I could add to my table to enable easier/faster ordering, or perhaps a better way to store this.

I've considered several other methods of storing trees in my database. The first method I considered was simply having each node store a parent node. This is space efficient but very slow, a simple 10 node tree could take up to 10 queries to return all the nodes. The second thought was using Celko's left/right number tree, which is fast to select from and fairly easy to utilize, but when you add a node it requires updating every other node in the tree, I thought this would be inefficient for large trees.

Update: You can insert at any point in the tree at any time or move any node to any other spot.

In reply to Trees in SQL, adjacency lists and sorting. by BUU

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.