in reply to Comparing strings from different files
The suggestions to use a hash don't seem sound to me, since it looks like you have plenty of records with duplicate "labels". But perhaps there is a unique identifier in there that you are aware of but haven't clearly told us about, in which case a hash would work (provided the files easily fit in RAM).
I would instead sort each file and then run a classic "merge" algorithm over the two sorted files. Deciding how to sort the files requires more knowledge about their structure and content than I can deduce from just the example data you have posted.
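To make the "merge" idea concrete, here is a rough sketch in Python (not taken from the original thread). The function name `merge_compare`, the tags `left_only` / `right_only` / `both`, and the `key` parameter are all made up for illustration. It assumes both inputs are already sorted by the same key and that records with equal keys should be paired off one-to-one, which may or may not suit your data.

```python
def merge_compare(left, right, key=lambda rec: rec):
    """Walk two sorted iterables in step and tag each record.

    Assumes both inputs are sorted ascending by `key` (hypothetical helper).
    Yields (tag, record) where tag is 'left_only', 'right_only', or 'both'.
    """
    left_iter, right_iter = iter(left), iter(right)
    done = object()  # sentinel marking an exhausted input
    a = next(left_iter, done)
    b = next(right_iter, done)

    while a is not done and b is not done:
        ka, kb = key(a), key(b)
        if ka < kb:
            # Record exists only on the left; advance the left side.
            yield ("left_only", a)
            a = next(left_iter, done)
        elif ka > kb:
            # Record exists only on the right; advance the right side.
            yield ("right_only", b)
            b = next(right_iter, done)
        else:
            # Keys match: pair them off and advance both sides.
            yield ("both", a)
            a = next(left_iter, done)
            b = next(right_iter, done)

    # Whatever remains on either side has no partner.
    while a is not done:
        yield ("left_only", a)
        a = next(left_iter, done)
    while b is not done:
        yield ("right_only", b)
        b = next(right_iter, done)


if __name__ == "__main__":
    for tag, rec in merge_compare(["a", "b", "b", "d"], ["b", "c", "d"]):
        print(tag, rec)
```

Only one record from each side is held in memory at a time, so the same loop should work on open file handles (which iterate line by line) as long as the files were sorted beforehand with the same ordering that `key` compares.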
- tye