I didn't have time to post this right away this morning, but here's at least some thoughts about efficiently measuring the "distance" between two nodes.
Now I see you will probably go in a simpler direction, but I'm posting this anyway in case you (or anyone else) feel the urge to implement such a structure.
Once you have a suitable balanced binary tree, you can augment it with threading to get a fast inorder traversal starting from any point. You can also augment it in the following way so that given two keys, you can find them in the tree in the normal way, but at the same time compute the "distance" between them at no extra cost.
Add two slots for each node of the tree, to store the number of left descendants and right descendants. When inserting, you can do the increments while traversing the tree to find the insertion spot, just before following a child pointer. When deleting, you decrement appropriate values all the way down to the node. If your balanced tree implementation requires e.g. rotations every once in a while, bookkeeping the number of left/right descendants will be easy to maintain for these operations as well.
Now, using this extra bookkeeping, here's how you can get pointers to X and Y (X < Y) and at the same time get the number of nodes "between" X and Y:
- In parallel, keep track of two node pointers, PX and PY, and traverse down the tree to find X and Y respectively.
- PX and PY both start at the root. At the first point in time where they diverge (PX = PY but PX is about to turn left and PY is about to turn right), set N=1
- Thereafter, every time PX is about to turn left, add (1+PX.right_descendants) to N. And every time PY is about to turn right, add (1+PY.left_descendants) to N.
- When PX and PY finally reach X and Y respectively, the value N will be the number of nodes between X and Y in the ordering.
Alternatively, if you already have pointers to X & Y, you could do this "in reverse", working back up towards the root. But I found it simpler to describe working from the top down. Anyway, this gives a O(log n) algorithm for their "distance".