in reply to 15 billion row text file and row deletes - Best Practice?
Wow! What an eye-catching question, good one!
I would be wary of using grep for this, as bsdz and sgt have mentioned.
If you have a look at grep -vf exclude_file to_thin_file in perl, you will see that Perl can do this much faster and with less memory than grep if the script is written efficiently.
My workmate swears by DBD::CSV - but I haven't used it.
Personally I think I'd feel safer writing to a new file rather than rewriting the original in place, in case anything went wrong partway through; if it's running for a week, that's a long time to risk a crash!
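For what it's worth, here's a minimal sketch of the approach that thread describes: load the exclusion list into a hash for fast lookups, then stream the big file line by line and write the kept lines to a new file. The file names are just placeholders; adjust the key extraction (here, the whole line) to however your rows are actually matched.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Placeholder file names - substitute your own.
my ($exclude_file, $big_file, $out_file) = ('exclude.txt', 'to_thin.txt', 'thinned.txt');

# Load the (comparatively small) exclusion list into a hash for O(1) lookups.
my %exclude;
open my $ex, '<', $exclude_file or die "Can't open $exclude_file: $!";
while (my $line = <$ex>) {
    chomp $line;
    $exclude{$line} = 1;
}
close $ex;

# Stream the huge file one line at a time and write the kept lines to a
# new file, so the original stays untouched if anything goes wrong.
open my $in,  '<', $big_file or die "Can't open $big_file: $!";
open my $out, '>', $out_file or die "Can't open $out_file: $!";
while (my $line = <$in>) {
    chomp(my $key = $line);
    print {$out} $line unless exists $exclude{$key};
}
close $in;
close $out or die "Error closing $out_file: $!";
```

The point is that only the exclusion list lives in memory; the 15 billion rows are never held at once, and the original file is only ever read.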