in reply to compare 2 files and return the number of similar sentences

Maybe you know or can make some assumptions about the general conditions required for your program.

Your code (with the correction mentioned above, using 'eq' instead of '==') may already be fine; it may not be if the files to compare can be very large.
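As a quick illustration of why that correction matters: '==' compares its operands as numbers, so two different non-numeric strings both evaluate to 0 and compare as equal, while 'eq' compares them as strings. A minimal sketch (the sample sentences are made up, not taken from your files):

```perl
use strict;
use warnings;

my $s1 = "The cat sat.";
my $s2 = "A dog barked.";

{
    # '==' forces numeric context; both strings become 0 here.
    # With warnings enabled Perl would complain that the arguments
    # are not numeric, so the warning is silenced for this demo only.
    no warnings 'numeric';
    print "== says equal\n" if $s1 == $s2;
}

# 'eq' compares character by character, which is what you want.
print "eq says different\n" unless $s1 eq $s2;
```

Both lines are printed, which shows that '==' would count every pair of sentences as a match.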

Then some other possible restrictions/caveats would come up:

- Are there any restrictions on how much memory is available?
- Is it important to get a quick answer, or is it fine if the processing takes some time?
- Does 'similarity' mean 'equality' (as in your code), or is a somewhat 'softer' test needed (tolerating differences in whitespace, line breaks, letter case, ...)?

Answering these questions would tell you whether the solution drafted above is sufficient or whether it needs refinement.
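If a 'softer' test or larger input ever becomes relevant, one possible refinement is a hash lookup on normalized sentences. The following is only a sketch under assumptions: the sentences have already been split into two lists, and the names normalize and count_common are hypothetical helpers, not part of your code.

```perl
use strict;
use warnings;

# Hypothetical helper: collapse whitespace runs (including line
# breaks), trim both ends, and lowercase the sentence.
sub normalize {
    my ($s) = @_;
    $s =~ s/\s+/ /g;
    $s =~ s/^ //;
    $s =~ s/ $//;
    return lc $s;
}

# Count the distinct sentences of @$second that also occur in @$first.
# A true $soft compares normalized forms instead of exact strings.
sub count_common {
    my ($first, $second, $soft) = @_;

    # Only the first list is kept in memory as hash keys.
    my %seen;
    $seen{ $soft ? normalize($_) : $_ } = 1 for @$first;

    my %counted;
    my $count = 0;
    for my $s (@$second) {
        my $key = $soft ? normalize($s) : $s;
        # Count each matching sentence once, even if it repeats.
        $count++ if $seen{$key} && !$counted{$key}++;
    }
    return $count;
}

my @a = ("The cat sat.", "It  rained all day.");
my @b = ("the cat sat.", "It rained all day.", "Nothing matches this.");

print count_common(\@a, \@b, 0), "\n";   # exact comparison: prints 0
print count_common(\@a, \@b, 1), "\n";   # soft comparison:  prints 2
```

The hash makes each lookup roughly constant time, so the work grows with the total number of sentences rather than with the product of the two file sizes. Only the first list has to be held in memory; the second could be read sentence by sentence.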

Re^2: compare 2 files and return the number of similar sentences
by barrymcv (Initiate) on Apr 20, 2007 at 13:36 UTC
    Thanks for the quick responses. Memory and time shouldn't be an issue, and the files will be small. I'm keeping it simple, so I would just like to find identical sentences. Is it difficult to pass the $MatchCount variable back into a PHP script? Thanks again.