in reply to 15 billion row text file and row deletes - Best Practice?

Nice thread. I would treat the file as a database.

A Unix solution would be:

  • cut the kill file into k chunks
  • do fast fixed-string matching against the serial file with grep
  • grep -nF -f chunk serial_file > delete_file # you need k passes, one per chunk!
  • with perl, read the line numbers n from delete_file and, for each n, overwrite that line of the serial file with "X" x length $_ (same length, so no later byte offsets move); a rough sketch follows this list
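
Roughly, and untested, the whole pass could look like the script below. All the file names, the chunk prefix and the chunk count $k are placeholders, and it assumes split(1), GNU grep and a perl with large-file support:

  #!/usr/bin/perl
  use strict;
  use warnings;

  # Placeholder names; point these at your real files.
  my $kill_file   = 'kill_file';
  my $serial_file = 'serial_file';
  my $delete_file = 'delete_file';
  my $k           = 20;              # number of kill-file chunks; tune to your RAM

  # Step 1: cut the kill file into k chunks so each grep -f pattern set stays small.
  my $entries = 0;
  open my $kf, '<', $kill_file or die "open $kill_file: $!";
  $entries++ while <$kf>;
  close $kf;
  my $per_chunk = int($entries / $k) + 1;
  system('split', '-l', $per_chunk, $kill_file, 'chunk.') == 0
      or die "split failed: $?";

  # Step 2: one grep pass per chunk; -F fixed strings, -n line numbers.
  open my $out, '>', $delete_file or die "open $delete_file: $!";
  for my $chunk (glob 'chunk.*') {
      open my $grep, '-|', 'grep', '-nF', '-f', $chunk, $serial_file
          or die "grep: $!";
      print {$out} $_ while <$grep>;
      close $grep;
  }
  close $out;

  # Step 3: collect the doomed line numbers ("lineno:match" from grep -n).
  # For a really big delete list you would stream these instead of keeping a hash.
  my %kill;
  open my $df, '<', $delete_file or die "open $delete_file: $!";
  while (<$df>) { $kill{$1} = 1 if /^(\d+):/ }
  close $df;

  # Step 4: walk the serial file once, blanking each doomed line in place
  # with X's of the same length, so nothing after it has to move.
  open my $fh, '+<', $serial_file or die "open $serial_file: $!";
  my $lineno = 0;
  while (1) {
      my $offset = tell $fh;         # byte offset of the line about to be read
      my $line   = <$fh>;
      last unless defined $line;
      $lineno++;
      next unless delete $kill{$lineno};
      my $len = length $line;
      $line =~ s/\R\z//;             # keep the newline, blank only the payload
      seek $fh, $offset, 0;          # a seek is required between read and write
      print {$fh} 'X' x length($line);
      seek $fh, $offset + $len, 0;   # resume reading where we left off
  }
  close $fh;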

    if you later add entries, put each one into the first X-ed line it fits in, or else append it at the end (see the sketch below)
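
A guess at that slot-reuse step, with a made-up add_entry() helper; it pads the leftover space with X's, so if trailing X's would corrupt your record format, insist on an exact length match instead:

  # Hypothetical helper: drop $record into the first all-X line it fits in,
  # padding the leftover space with X's; otherwise append at end of file.
  sub add_entry {
      my ($serial_file, $record) = @_;
      open my $fh, '+<', $serial_file or die "open $serial_file: $!";
      while (1) {
          my $offset = tell $fh;
          my $line   = <$fh>;
          last unless defined $line;
          $line =~ s/\R\z//;
          next unless $line =~ /\AX+\z/;               # only reuse blanked slots
          next if length($line) < length($record);     # the record has to fit
          seek $fh, $offset, 0;
          print {$fh} $record, 'X' x (length($line) - length($record));
          close $fh;
          return;
      }
      seek $fh, 0, 2;                                  # no free slot: append
      print {$fh} $record, "\n";
      close $fh;
  }

Scanning from the top every time will crawl on a file this size, so in practice you would probably keep a free-slot list on the side.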

    (update) if you need to repeat the process a few times, it may be worth sorting the serial file first
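
If it comes to that, I would not try to sort it in Perl itself; GNU sort already does an external merge sort on disk. The temp directory, buffer size and parallelism below are only guesses to tune for your box:

  # Let GNU sort do the external merge sort; adjust -T/-S/--parallel to taste.
  system('sort', '-T', '/big/tmp', '-S', '50%', '--parallel=8',
         '-o', 'serial_file.sorted', 'serial_file') == 0
      or die "sort failed: $?";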
