in reply to Comparing strings from different files

The suggestions to use a hash don't seem sound to me, since it looks like you have plenty of records with duplicate "labels". Perhaps there is a unique identifier in there that you know about but haven't clearly told us about, in which case a hash would work (provided the files easily fit in RAM).

I would instead sort each file and then run a classic "merge" algorithm over the two sorted files. How to sort the files depends on more knowledge of their structure and content than I can deduce from just the example data you have posted.
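To make the merge idea concrete, here is a minimal sketch in Python (the language is my choice for illustration, not something from the original question). It assumes both inputs are already sorted by the same key. The function name merge_compare and the default key (the whole record) are hypothetical, since the real sort key depends on your data. Duplicates are paired off one-to-one, which is only one of several reasonable policies.

```python
def merge_compare(left, right, key=lambda record: record):
    """Walk two already-sorted iterables in step and classify each record.

    Returns (only_left, both, only_right). Assumes both inputs are sorted
    by `key`; the default key (the whole record) is just a placeholder.
    """
    only_left, both, only_right = [], [], []
    left_iter, right_iter = iter(left), iter(right)
    a = next(left_iter, None)
    b = next(right_iter, None)

    # Advance whichever side has the smaller key; on a tie, pair them up.
    while a is not None and b is not None:
        ka, kb = key(a), key(b)
        if ka < kb:
            only_left.append(a)
            a = next(left_iter, None)
        elif ka > kb:
            only_right.append(b)
            b = next(right_iter, None)
        else:
            both.append((a, b))
            a = next(left_iter, None)
            b = next(right_iter, None)

    # One side is exhausted; whatever remains on the other is unmatched.
    while a is not None:
        only_left.append(a)
        a = next(left_iter, None)
    while b is not None:
        only_right.append(b)
        b = next(right_iter, None)

    return only_left, both, only_right
```

Because it only ever holds one record from each side, the same loop works on open file handles for files too large to fit in RAM, as long as each file was sorted beforehand.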

- tye        
