It sounds like the major bottleneck in the process is going to be reading data from the disk. So the best way to speed this up would be to break the file down into a few chunks, and to run this job on separate machines.
You can just merge the data it produces at the end of the process.