If you have 400,000 unique characteristic sets among the 8 million patients, now you're getting somewhere. If you could find a way to consistently stringify a given set the same way each time it comes up, you could turn that into a hash key, and as its value create a datastructure of patient names. Now you have a workable structure that could be split into manageable files based on the groupings by unique characteristics.
...just a thought, though I'm not sure how helpful it is.
| [reply] |