Use a database. This is exactly what databases are designed to do. Anything you write yourself would simply replicate work that has literally had man-DECADES thrown at it.
Load the files into a database. Tables of 2M and 8M rows are small-ish.
Run SQL queries against those tables.
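As a rough sketch of those steps, here is one way it could look with Python's built-in sqlite3 module. The question's file layout is not given here, so the table names (small, large), the column names, the join on id, and the load_csv helper are all made up for illustration, and tiny in-memory strings stand in for the real files. Any other database would do just as well.

```python
import csv
import io
import sqlite3


def load_csv(conn, table, columns, lines):
    """Hypothetical helper: bulk-insert CSV rows (header expected) into a new table."""
    reader = csv.reader(lines)
    next(reader)  # skip the header row
    col_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(f"CREATE TABLE {table} ({col_list})")
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", reader)
    conn.commit()


# Tiny stand-ins for the real files (assumed layout).
# With real data, pass open(path, newline="") instead.
small_file = io.StringIO("id,name\n1,alpha\n2,beta\n3,gamma\n")
large_file = io.StringIO("id,value\n1,10\n1,20\n3,30\n4,40\n")

conn = sqlite3.connect(":memory:")  # use a file path to keep the data on disk
load_csv(conn, "small", ["id", "name"], small_file)
load_csv(conn, "large", ["id", "value"], large_file)

# An index on the join key keeps lookups fast as the tables grow.
conn.execute("CREATE INDEX large_id ON large (id)")

# Let the database do the matching instead of hand-rolled loops.
matches = conn.execute(
    "SELECT s.name, l.value FROM small s JOIN large l ON l.id = s.id "
    "ORDER BY s.name, l.value"
).fetchall()
print(matches)
```

Once the data is loaded, changing the question you ask of it means changing the SELECT, not rewriting the loading or matching code.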
My criteria for good software:
Does it work?
Can someone else come in, make a change, and be reasonably certain no bugs were introduced?