Your title made me think of something I'd been investigating recently: code similarity analyzers (I first encountered that term here). You may also find the thread I cross-referenced to contain some interesting ideas.
The subject of text mining inevitably comes up, and the nodes "text mining" and "Fingerprinting text documents for approximate comparison" may give you some useful ideas and/or resources.
Recently, when I needed to do some keywording/summarizing, I hacked together a wee little script using (among other things):
... and it worked pretty well, actually. :-)
HTH,
planetscape