in reply to When the input file is huge !!!
I would sort both files first. Unix ships a sort utility for this; if you don't have it, Sort::External is a pure Perl version (I have not used it myself). Then process both files in parallel, the idea being that sequences come up in the same order in both files. You keep 2 filehandles (one for each file) and 2 last lines (one for each file), and you always read the next line from whichever file's last line is smaller, processing a match when you find one. That way you only ever hold two lines in RAM, never the whole data set.
Be warned that sorting 8 GB is liable to take 20 minutes or so on your machine.
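For what it's worth, here is a minimal sketch of the parallel read, written in Python only to keep it short; the same loop carries over to Perl filehandles directly. Several things here are assumptions, not something from the original question: the key is taken to be the first tab-separated field of each line, keys are taken to be unique within each file, and the names `key_of` and `merge_matches` are made up for illustration. The comparison in the loop also has to agree with the order the files were sorted in (for example plain byte order, which is what sorting under the C locale gives you).

```python
import io

def key_of(line):
    # Assumed record layout: the key is the first tab-separated field.
    return line.rstrip("\n").split("\t", 1)[0]

def merge_matches(fh_a, fh_b):
    """Yield (line_a, line_b) for every key found in both sorted inputs.

    Assumes both inputs are sorted by key in the same order and that
    each key appears at most once per input.
    """
    line_a = fh_a.readline()
    line_b = fh_b.readline()
    while line_a and line_b:          # readline() returns "" at end of file
        key_a, key_b = key_of(line_a), key_of(line_b)
        if key_a < key_b:
            line_a = fh_a.readline()  # file A is behind, advance it
        elif key_a > key_b:
            line_b = fh_b.readline()  # file B is behind, advance it
        else:
            yield line_a.rstrip("\n"), line_b.rstrip("\n")
            line_a = fh_a.readline()
            line_b = fh_b.readline()

# Tiny in-memory stand-ins for the two sorted files.
sorted_a = io.StringIO("a\t1\nb\t2\nd\t4\n")
sorted_b = io.StringIO("b\tx\nc\ty\nd\tz\n")
for rec_a, rec_b in merge_matches(sorted_a, sorted_b):
    print(rec_a, "|", rec_b)
```

With real data you would pass the two opened sorted files in place of the `StringIO` objects. If keys can repeat within a file, the equal branch needs extra handling to pair up every duplicate, which this sketch leaves out.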
Re^2: When the input file is huge !!!
by Marshall (Canon) on Jan 06, 2009 at 05:28 UTC
by tilly (Archbishop) on Jan 06, 2009 at 19:54 UTC
by BrowserUk (Patriarch) on Jan 06, 2009 at 21:46 UTC
by tilly (Archbishop) on Jan 07, 2009 at 03:01 UTC
by BrowserUk (Patriarch) on Jan 07, 2009 at 04:03 UTC