I created a 500k-line file and ran the code suggested by rir on my WinXP 1.1GHz 256MB laptop, and it took 57 seconds. I realise that because all the records are the same, this is not a very scientific test.
Without the timelocal call it was 26 seconds.
However, 19 minutes does seem a long time. Maybe memory or disk IO is having a significant effect.