Recently I found myself wishing to store a tree in a database. A secondary goal was to do so very, very fast. (Why? Because). After much pondering and head scratching I came up with what is basically an adjacency list. Here is an example:
Give a tree that looks like:
A
/ \
B C
/ \ \
D E F
I store that in a table that looks like this:
node_id | ancestor | distance
-----------------------------
B | A | 1
C | A | 1
D | B | 1
D | A | 2
E | B | 1
E | A | 2
F | C | 1
F | A | 2
Note that nodes D, E and F have multiple rows in the table.
The advantage of this lay out is that it is very, very fast to select from. I can get any number of children with a single query. For example, all of the children of B would be:
SELECT * FROM table WHERE ancestor = 'B' which returns "D" and "E". The downside is that it takes more room on disk. However, some basic testing with SQLite showed me that I could fit some 5 million of these rows in a table as small as 70 megs, so I'm not overly worried about that.
As I've described it, my table works great. Selecting is very fast, I can find any level I want easily, it's fast to add new ones and so on. My single problem is that I figure out how to order my nodes once I've selected them. Give the above tree/table, my ideal order looks like: A, B, D, E, C, F.
So I get to my question. How can I sort my nodes so that they come out in the order I want? I'd definitely prefer something that would be pure sql, mostly for speed purposes. I'm also open to suggestions of what I could add to my table to enable easier/faster ordering, or perhaps a better way to store this.
I've considered several other methods of storing trees in my database. The first method I considered was simply having each node store a parent node. This is space efficient but very slow, a simple 10 node tree could take up to 10 queries to return all the nodes. The second thought was using Celko's left/right number tree, which is fast to select from and fairly easy to utilize, but when you add a node it requires updating every other node in the tree, I thought this would be inefficient for large trees.
Update: You can insert at any point in the tree at any time or move any node to any other spot.
Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
Read Where should I post X? if you're not absolutely sure you're posting in the right place.
Please read these before you post! —
Posts may use any of the Perl Monks Approved HTML tags:
- a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
| |
For: |
|
Use: |
| & | | & |
| < | | < |
| > | | > |
| [ | | [ |
| ] | | ] |
Link using PerlMonks shortcuts! What shortcuts can I use for linking?
See Writeup Formatting Tips and other pages linked from there for more info.