in reply to Searching for duplication in legacy code
I wrote a text duplication checker (see Code::DRY) that uses suffix arrays for performance. It has no special knowledge of Perl or of units such as subs, but it finds duplicated lines quite fast. You need a C compiler to build the libraries; after that, memory permitting, you can scan whole directory trees for duplicates.
I once planned to use it for a refactoring tool, but first wanted to add an option for finding structural duplicates (e.g. in token streams), and that is where I got stuck...
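For anyone wondering how suffix arrays help with this, here is a minimal sketch of the idea in Python. It is not how Code::DRY is implemented; the function name `find_duplicate_runs`, the `min_lines` parameter and the sample input are made up for illustration, and the naive sort stands in for the fast suffix array construction a real tool would need.

```python
def find_duplicate_runs(lines, min_lines=2):
    """Return (length, first_start, second_start) for repeated runs of lines.

    Illustrative sketch only: treats each line as one symbol and builds a
    suffix array over the sequence of lines.
    """
    n = len(lines)
    # Suffix array: start positions, sorted by the sequence of lines that
    # follows each position. Naive construction, far too slow for big trees.
    suffixes = sorted(range(n), key=lambda i: lines[i:])
    runs = []
    for a, b in zip(suffixes, suffixes[1:]):
        # Neighbours in the suffix array share the longest common prefixes.
        k = 0
        while a + k < n and b + k < n and lines[a + k] == lines[b + k]:
            k += 1
        # Do not let a run overlap its own copy.
        k = min(k, abs(a - b))
        if k >= min_lines:
            first, second = sorted((a, b))
            runs.append((k, first, second))
    return sorted(runs, reverse=True)


# Hypothetical sample input: lines 0-1 are repeated at lines 3-4.
sample = [
    "my $x = 1;",
    "print $x;",
    "foo();",
    "my $x = 1;",
    "print $x;",
    "bar();",
]

for length, first, second in find_duplicate_runs(sample):
    print(f"{length} lines starting at {first} are repeated at {second}")
```

Because sorting puts similar suffixes next to each other, duplicates show up by comparing neighbours only, which is where the speed comes from. The same scheme would apply to a token stream instead of lines, provided you have a tokenizer to feed it.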
Hope this helps, hexcoder