Use the command-line sort utility to sort both files on the id. Then you can "walk" through both of them in one pass instead of rereading file B 5 million times! Alternatively, B appears to be small enough to fit into memory, so you could just load it into a hash. I take it, though, that the "unique ids" aren't really unique and one id can have multiple values; if so, just make it a HoA (hash of arrays).
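Here is a rough sketch of both ideas, written in Python for illustration (a HoA maps onto a dict of lists). It assumes things the question doesn't confirm: one record per line, no blank lines, and the id as the first whitespace-separated field. The names `key_of`, `build_index`, and `merge_walk` are made up for this example. For the one-pass walk, both inputs must already be sorted on the id in plain byte order, for instance with something like `LC_ALL=C sort -k1,1`.

```python
from collections import defaultdict
from itertools import groupby


def key_of(line):
    # Assumption: the id is the first whitespace-separated field of the line.
    return line.split(None, 1)[0]


def build_index(lines_b):
    """Hash-of-arrays approach: map each id to every B line that carries it."""
    index = defaultdict(list)
    for line in lines_b:
        index[key_of(line)].append(line)
    return index


def merge_walk(sorted_a, sorted_b):
    """Walk two inputs, both already sorted by id, in a single pass.

    Yields (a_line, b_line) for every pair of lines that share an id.
    Only the B lines for the current id are held in memory.
    """
    done = (None, None)
    groups_a = groupby(sorted_a, key=key_of)
    groups_b = groupby(sorted_b, key=key_of)
    ka, ga = next(groups_a, done)
    kb, gb = next(groups_b, done)
    while ka is not None and kb is not None:
        if ka < kb:
            # This id only exists in A; skip ahead in A.
            ka, ga = next(groups_a, done)
        elif ka > kb:
            # This id only exists in B; skip ahead in B.
            kb, gb = next(groups_b, done)
        else:
            # Same id on both sides: pair every A line with every B line.
            b_lines = list(gb)
            for a_line in ga:
                for b_line in b_lines:
                    yield a_line, b_line
            ka, ga = next(groups_a, done)
            kb, gb = next(groups_b, done)
```

With real files you would pass the open file handles of the two sorted files to `merge_walk`, or pass B's handle to `build_index` and then look up each line of A by its id. The walk keeps memory use flat no matter how big the files are; the index is simpler and needs no sorting, but B has to fit in memory.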