The general term for what you are doing is record linkage.
The particular phase you are trying to optimize might be called candidate record pruning.
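To make the pruning idea concrete, here is a minimal sketch of one common way to do it, usually called blocking: only records that share a cheap key are compared with each other. This is an illustration, not necessarily what your code or FEBRL does; the field names (`surname`, `postcode`) and the key itself are assumptions made up for the example.

```python
from collections import defaultdict
from itertools import combinations


def blocking_key(record):
    # Hypothetical key: first letter of the surname plus the postcode.
    # A real key should be chosen from fields that are rarely wrong.
    return (record["surname"][:1].upper(), record["postcode"])


def candidate_pairs(records):
    """Return index pairs of records that share a blocking key."""
    blocks = defaultdict(list)
    for idx, rec in enumerate(records):
        blocks[blocking_key(rec)].append(idx)

    pairs = []
    for members in blocks.values():
        # Only compare records inside the same block.
        pairs.extend(combinations(members, 2))
    return pairs


records = [
    {"surname": "Smith", "postcode": "2000"},
    {"surname": "Smyth", "postcode": "2000"},
    {"surname": "Jones", "postcode": "2000"},
    {"surname": "smith", "postcode": "3000"},
]
pairs = candidate_pairs(records)
```

With four records there are six possible pairs, but only the two records in the same block are kept as candidates for the expensive comparison. The trade-off is that a bad key can hide true matches, so it is worth testing a few keys against data you have already matched by hand.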
I learned a lot by reading the docs for FEBRL. They used Markov Modelling to prune the search space. FEBRL also includes some good approximate matching methods that are not implemented in pure Perl (such as Jaro-Winkler).
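For reference, here is a rough sketch of Jaro-Winkler in plain Python, written from the textbook definition rather than copied from FEBRL. The parameter names `prefix_scale` and `max_prefix` are my own, and 0.1 and 4 are the commonly quoted defaults.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters count as matching only within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2

    return (
        matches / len1
        + matches / len2
        + (matches - transpositions) / matches
    ) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted for a shared prefix of up to max_prefix characters."""
    score = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return score + prefix * prefix_scale * (1 - score)
```

For example, `jaro_winkler("MARTHA", "MARHTA")` comes out at about 0.961 and `jaro_winkler("DIXON", "DICKSONX")` at about 0.813. The prefix boost is what makes it work well on names, where typos are more likely toward the end of the string.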