Sort the primary file (100s of millions of records).
Split it into sub-files of ~10 million records each, naming each sub-file for the last key it contains.
Sort the secondary file -> ssf1. You may sort multiple secondary files (ssf2, ssf3, etc.) the same way.

Perl program (pseudocode; a runnable sketch follows below):

    read the directory containing the primary sub-files by name
    create an array of the file names (@pfn) -- each name is the last key in that file
    open ssf1 ssf2 ssf3 ssf4
    $ssf1_record = " ";   # or any value lower than the lowest key
    $ssf2_record = " ";   # or any value lower than the lowest key
    $ssf3_record = " ";   # or any value lower than the lowest key
    $ssf4_record = " ";   # or any value lower than the lowest key

    foreach $pf (@pfn) {
        open the primary sub-file ($pf)
        read the $pf records into a hash (%pfrh)

        while ($ssf1_record lt $pf) {
            compare to the hash
            if found { action }
            read the next $ssf1 record;
        }
        while ($ssf2_record lt $pf) {
            compare to the hash
            if found { action }
            read the next $ssf2 record;
        }
        # ... repeat for ssf3, ssf4
    }
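To make the compare loop concrete, here is a minimal, runnable Perl sketch of that step. The directory name, the secondary file names, the one-key-per-line record layout, and the handle_match() routine are assumptions for illustration only, not part of the original description; adapt them to your real record format. It also uses "le" rather than "lt" so a secondary key equal to a sub-file's name (its last key) is checked against that sub-file rather than skipped to the next one.

    #!/usr/bin/perl
    # Sketch of the merge-compare loop, under the assumptions above.
    use strict;
    use warnings;

    my $primary_dir = 'primary';                  # assumed directory of sorted sub-files
    my @secondary   = ('ssf1.txt', 'ssf2.txt');   # assumed sorted secondary files

    # Sub-file names sort the same way as the keys they end with,
    # so a lexical sort gives the key ranges in ascending order.
    opendir my $dh, $primary_dir or die "Can't open $primary_dir: $!";
    my @pfn = sort grep { -f "$primary_dir/$_" } readdir $dh;
    closedir $dh;

    # Open every secondary file once and prime it with its first record.
    my @sec;
    for my $file (@secondary) {
        open my $fh, '<', $file or die "Can't open $file: $!";
        my $rec = <$fh>;
        chomp $rec if defined $rec;
        push @sec, { fh => $fh, rec => $rec };
    }

    for my $pf (@pfn) {
        # Load this ~10-million-record slice of the primary file into a hash.
        my %pfrh;
        open my $pfh, '<', "$primary_dir/$pf" or die "Can't open $pf: $!";
        while (my $key = <$pfh>) {
            chomp $key;
            $pfrh{$key} = 1;
        }
        close $pfh;

        # Advance each secondary file while its current key still falls
        # within the range covered by this sub-file (named for its last key).
        for my $s (@sec) {
            while (defined $s->{rec} && $s->{rec} le $pf) {
                handle_match($s->{rec}) if exists $pfrh{ $s->{rec} };
                my $next = readline $s->{fh};
                chomp $next if defined $next;
                $s->{rec} = $next;
            }
        }
    }

    # Hypothetical stand-in for whatever "action" you need on a hit.
    sub handle_match {
        my ($key) = @_;
        print "match: $key\n";
    }

Because every file is read sequentially and only one ~10-million-record slice is held in memory at a time, the whole pass stays cheap even when the primary file itself is far too large for a single hash.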
Enjoy!
Dageek
In reply to Re: Comparing strings (exact matches) in LARGE numbers FAST by johndageek, in thread Comparing strings (exact matches) in LARGE numbers FAST by perlSD