The other comments mostly boil down to "use a database," but you only have 300k records. Ask yourself whether they will all fit in memory; if so, you can just read file 1 into a hash and look things up from there. If your script has 2 GB of memory available to it, that works out to roughly 6-7 KB per record (2 GB / 300,000).
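Here's a rough sketch of what I mean, in Python (where the hash is a dict). I'm assuming tab-delimited records with the key in the first column, since the thread doesn't pin down the format; the function names `load_index`, `match_records`, and `bytes_per_record` are made up for illustration, so adjust to taste.

```python
import csv


def bytes_per_record(memory_bytes, n_records):
    """Back-of-the-envelope memory budget per record."""
    return memory_bytes // n_records


def load_index(lines, key_col=0, delimiter="\t"):
    """Read every record of file 1 into a dict keyed by one column.

    `lines` is any iterable of text lines (an open file works).
    Assumption: keys are unique; a repeated key overwrites the earlier row.
    """
    index = {}
    for row in csv.reader(lines, delimiter=delimiter):
        if row:  # skip blank lines
            index[row[key_col]] = row
    return index


def match_records(index, lines, key_col=0, delimiter="\t"):
    """Stream file 2 and yield (file-1 row, file-2 row) for keys found in both."""
    for row in csv.reader(lines, delimiter=delimiter):
        if row and row[key_col] in index:
            yield index[row[key_col]], row


if __name__ == "__main__":
    # Tiny in-memory stand-ins for the two files (hypothetical data).
    file1 = ["a\tapple", "b\tbanana", "c\tcherry"]
    file2 = ["b\t10", "d\t20", "a\t30"]

    index = load_index(file1)
    for left, right in match_records(index, file2):
        print(left, right)

    # Budget check: 2 GiB spread over 300k records.
    print(bytes_per_record(2 * 1024**3, 300_000))
```

With real files you would pass open file handles instead of the lists. Only file 1 is held in memory; file 2 is read one line at a time, so its size doesn't matter much. Keep in mind the dict itself has overhead, so the usable budget per record is somewhat less than the raw figure.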