in reply to Re^2: Processing large file using threads
in thread Processing large file using threads
Checking 21 million URLs shouldn't really take all that much time, and reading them all from a file shouldn't take long either; that's only about a gigabyte of data. I am assuming you have significantly fewer hosts than URLs; otherwise, with the algorithm you've outlined above, you might need a lot of system memory (i.e. more than 4 GB).
PS: Why isn't all this data in a database? Provided you've already split the hosts out from the URLs and linked the two, you can do this kind of query in a single line of SQL, and it'll probably be pretty fast too.
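To give an idea of what I mean by a single line of SQL, here is a rough sketch using Python's built-in sqlite3 module (not Perl, just to keep it self-contained). The table and column names are made up, and I'm guessing that the query you want is something like "how many URLs per host"; swap in whatever check you are actually doing.

```python
import sqlite3

# Hypothetical schema: the thread does not show the real data layout,
# so the hosts/urls tables and their columns are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hosts (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE urls (
        id INTEGER PRIMARY KEY,
        host_id INTEGER REFERENCES hosts(id),
        path TEXT
    );
    CREATE INDEX urls_host_idx ON urls(host_id);
""")

# A few sample rows standing in for the 21 million real ones.
conn.executemany(
    "INSERT INTO hosts (id, name) VALUES (?, ?)",
    [(1, "example.com"), (2, "example.org")],
)
conn.executemany(
    "INSERT INTO urls (host_id, path) VALUES (?, ?)",
    [(1, "/a"), (1, "/b"), (2, "/")],
)

# The "single line of SQL": count the URLs belonging to each host.
QUERY = (
    "SELECT h.name, COUNT(*) FROM urls u "
    "JOIN hosts h ON h.id = u.host_id "
    "GROUP BY h.name ORDER BY h.name"
)
rows = conn.execute(QUERY).fetchall()
for name, url_count in rows:
    print(name, url_count)
```

With an index on the host column, the database does the grouping for you, so you don't have to hold a hash of every host in memory yourself.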