in reply to De Duping Street Addresses Fuzzily
One case to watch out for, which sorting won't catch, is "Fifth St." vs. "5th St.". You could clean those up with a simple substitution hash ({ '1st' => 'First', '2nd' => 'Second', ...}), or have a look at the modules Lingua::EN::Numbers and Lingua::EN::Numbers::Ordinate and write something to handle the general case.
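Here is a rough sketch of the substitution-hash idea, using core Perl only (5.10 or later for the // operator). The table %ordinal_word and the sub name normalize_ordinals are made up for illustration, and the table only goes up to 10th, so anything beyond that is left untouched. Treat it as a starting point rather than a finished solution.

```perl
use strict;
use warnings;

# Hypothetical lookup table; extend it to cover the ordinals in your data.
my %ordinal_word = (
    '1st'  => 'First',   '2nd' => 'Second', '3rd' => 'Third',
    '4th'  => 'Fourth',  '5th' => 'Fifth',  '6th' => 'Sixth',
    '7th'  => 'Seventh', '8th' => 'Eighth', '9th' => 'Ninth',
    '10th' => 'Tenth',
);

# normalize_ordinals is an illustrative name, not from any module.
sub normalize_ordinals {
    my ($address) = @_;

    # Replace each digit-style ordinal that has a table entry;
    # ordinals without an entry (e.g. "21st") are left as they are.
    $address =~ s{\b(\d+(?:st|nd|rd|th))\b}{ $ordinal_word{ lc $1 } // $1 }gie;

    return $address;
}

print normalize_ordinals('123 5th St.'), "\n";    # prints "123 Fifth St."
```

Run every address through this before sorting or comparing, so that both spellings end up in the same form. For the general case you would swap the fixed table for something built on the Lingua::EN::Numbers modules.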
-b