in reply to Searching for duplication in legacy code

Hi,

I wrote a text duplication checker (see Code::DRY) that uses suffix arrays for performance. It has no special knowledge of Perl or of units like subs, but it finds duplicated lines quite fast. You need a C compiler to build the libraries; after that, memory permitting, you can scan whole directory trees for duplicates.
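
To give a rough idea of the technique: the following is not Code::DRY's code or API, just a minimal Python sketch of the suffix-array idea applied to lines. The function name and the `min_lines` parameter are made up for illustration, and the suffix array is built naively by sorting, which is far slower than what a real tool would do. Once the suffixes are sorted, repeated runs of lines end up next to each other, so comparing neighbours is enough to find them.

```python
def find_duplicate_runs(lines, min_lines=2):
    """Return sorted (start_a, start_b, length) tuples for repeated runs of lines.

    Hypothetical helper for illustration only; not part of Code::DRY.
    """
    n = len(lines)
    # Naive suffix array: sort suffix start positions by the lines that follow.
    # Real implementations use much faster construction algorithms.
    suffixes = sorted(range(n), key=lambda i: lines[i:])
    found = []
    for a, b in zip(suffixes, suffixes[1:]):
        # Longest common prefix of two neighbouring suffixes, counted in lines.
        length = 0
        while (a + length < n and b + length < n
               and lines[a + length] == lines[b + length]):
            length += 1
        if length >= min_lines:
            first, second = sorted((a, b))
            found.append((first, second, length))
    return sorted(found)


if __name__ == "__main__":
    text = ["a", "b", "c", "x", "a", "b", "c", "y"]
    for first, second, length in find_duplicate_runs(text, min_lines=3):
        print(f"lines {first}..{first + length - 1} repeat at line {second}")
```

Note that with a small `min_lines` this also reports runs nested inside longer ones (e.g. the tail of a three-line duplicate as a two-line duplicate), so a real tool would still have to filter or merge the results.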

I once planned to use it as the basis for a refactoring tool, but first wanted to add an option to find structural duplicates (e.g. in token streams), and that is where I got stuck...

Hope this helps, hexcoder
