I'm very happy using gvimdiff on two files to get visual feedback on which parts of them are identical. And if you find that a bunch of the scripts are identical from lines 200-400, say, it should be possible to centralize those routines in a module.
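If eyeballing many pairs of scripts in gvimdiff gets tedious, the same idea can be roughed out in code. This is only a sketch, written in Python's standard difflib rather than anything from the original scripts; the function name `shared_blocks` and the `min_lines` threshold are my own assumptions.

```python
import difflib


def shared_blocks(lines_a, lines_b, min_lines=5):
    """Return (start_a, start_b, length) for each run of identical lines.

    Indexes are 0-based; min_lines is an arbitrary cut-off (an assumption)
    so that short coincidental matches are ignored.
    """
    # autojunk=False stops difflib from discarding frequently repeated lines
    matcher = difflib.SequenceMatcher(None, lines_a, lines_b, autojunk=False)
    return [
        (m.a, m.b, m.size)
        for m in matcher.get_matching_blocks()
        if m.size >= min_lines
    ]
```

Feeding it the result of `open(path).read().splitlines()` for each script gives a list of candidate regions; a long block showing up across several pairs is a good hint that the code belongs in a shared module.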
Writing tests for this kind of script can be a challenge, but setting up some test data files that cover as many corner cases as possible should go a long way toward solving that.
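One way to arrange those data files, as a rough sketch only: pair each input file with a file holding the output you expect, and loop over the pairs. The `.in` / `.expected` naming, the `run_cases` helper, and the `transform` callable standing in for the routine under test are all assumptions of mine, not something from the scripts in question.

```python
from pathlib import Path


def run_cases(case_dir, transform):
    """Run transform on every *.in file and return the names of failing cases.

    Each NAME.in is expected to have a NAME.expected file beside it
    (a hypothetical naming convention chosen for this sketch).
    """
    failures = []
    for input_path in sorted(Path(case_dir).glob("*.in")):
        expected_path = input_path.with_suffix(".expected")
        actual = transform(input_path.read_text())
        if actual != expected_path.read_text():
            failures.append(input_path.name)
    return failures
```

Adding a new corner case is then just a matter of dropping another pair of files into the directory, with no change to the test code itself.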
Alex / talexb / Toronto
"Groklaw is the open-source mentality applied to legal research" ~ Linus Torvalds