in reply to Search a hash for closest match

Another possibility is to match a "handful" of records at a time as you pass through the list sequentially. There is also a pretty good argument against using a fancy tree lookup structure here: page faults. If you probe a large structure at random, you incur page faults at random too, so it is best to make the most of each virtual-memory page once you have paid the cost of paging it in. Whatever testing you do before building the real thing, be sure to do it under realistic loads that actually cause page-fault activity on the target machine. (And if memory is capacious enough that no paging occurs, then any ol' algorithm will do.)
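
To make the batching idea concrete, here is a minimal sketch in Python (the thread itself is about a Perl hash; Python is used only for illustration). It assumes the keys are plain numbers and that "closest" means smallest absolute difference, neither of which the original question necessarily guarantees. The function name `closest_matches` and the sample data are hypothetical.

```python
def closest_matches(records, targets):
    """Scan `records` once, tracking the closest record for every target.

    Assumption: records and targets are numbers, and "closest" means
    smallest absolute difference. On a tie, the first record seen wins.
    """
    best = {}  # target -> (distance, record)
    for rec in records:          # one sequential pass over the data
        for t in targets:        # the "handful" of lookups handled together
            d = abs(rec - t)
            if t not in best or d < best[t][0]:
                best[t] = (d, rec)
    return {t: rec for t, (_, rec) in best.items()}


if __name__ == "__main__":
    records = [12, 47, 3, 88, 51, 19]
    print(closest_matches(records, [50, 5, 20]))  # {50: 51, 5: 3, 20: 19}
```

Each record is touched exactly once no matter how many targets are in the batch, so every page of the list is read in order and fully used before the scan moves on. Whether that beats a tree in practice is something only a test under realistic load will tell you.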