in reply to Finding similar data

grinder's most excellent Regexp::Assemble can certainly help you with part of your problem, identifying bits common to multiple words. Take a look at "Why machine-generated solutions will never cease to amaze me" for a sample of what this module can do; you'll be impressed.

Oh, and grinder's scratchpad too...
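
To give a rough idea of the "bits common to multiple words" part, here is a minimal sketch. It assumes Regexp::Assemble is installed from CPAN, and the word list is made up for illustration rather than taken from your data. It feeds a few words to the module and prints the single pattern it assembles, in which the shared leading text should be factored out instead of repeated per word.

```perl
use strict;
use warnings;
use Regexp::Assemble;    # non-core module, install from CPAN

# Hypothetical sample words (an assumption, not from the original thread).
my @words = qw(database datapoint dataset datum);

# Add each word as a pattern; the module merges them into one structure.
my $ra = Regexp::Assemble->new;
$ra->add($_) for @words;

# re() returns the assembled pattern as a compiled regex object.
my $re = $ra->re;
print "Assembled pattern: $re\n";

# Anchor the pattern to test whole words: the originals should match,
# while strings that were never added (such as 'data') should not.
for my $candidate (@words, 'data', 'date') {
    my $verdict = $candidate =~ /^$re$/ ? 'matches' : 'no match';
    printf "%-10s %s\n", $candidate, $verdict;
}
```

Reading the printed pattern is the useful part here: whatever the module pulled out in front of the alternation is the text your words have in common. It won't tell you how "similar" two records are, but it may be a handy first pass.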

planetscape