One case that sorting won't catch is "Fifth St." vs. "5th St.". You could normalize those with a simple substitution hash ({ '1st' => 'First', '2nd' => 'Second', ...}), or have a look at the modules Lingua::EN::Numbers and Lingua::EN::Numbers::Ordinate and write something to handle the general case.
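A rough sketch of the substitution-hash idea, using core Perl only. The table `%ordinal_word` and the sub `normalize_ordinals` are made-up names for illustration, and the table covers just the first five ordinals, so treat it as a starting point rather than a finished solution:

```perl
use strict;
use warnings;

# Hypothetical lookup table; extend it to cover the ordinals in your data.
my %ordinal_word = (
    '1st' => 'First',
    '2nd' => 'Second',
    '3rd' => 'Third',
    '4th' => 'Fourth',
    '5th' => 'Fifth',
);

# Replace each numeric ordinal found in the table with its spelled-out form.
# Ordinals that are not in the table are left unchanged.
sub normalize_ordinals {
    my ($address) = @_;
    $address =~ s{\b(\d+(?:st|nd|rd|th))\b}
                 { exists $ordinal_word{lc $1} ? $ordinal_word{lc $1} : $1 }gie;
    return $address;
}

print normalize_ordinals('123 5th St.'), "\n";
```

Run this over each address before sorting, so that both spellings collapse to the same string. For the general case, I believe Lingua::EN::Numbers can export a `num2en_ordinal` function that would replace the lookup table, but check the module's docs before relying on that.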
-b
In reply to Re: De Duping Street Addresses Fuzzily
by bgreenlee
in thread De Duping Street Addresses Fuzzily
by patrickrock