in reply to Comparing strings (exact matches) in LARGE numbers FAST
Sort the primary file (100s of millions of records).
Split it into sub-files of ~10 million records each, naming each sub-file for the last key it contains.
Sort the secondary file -> ssf1. You may sort multiple secondary files (ssf2, ssf3, etc.).

Perl program (pseudocode):

    read the directory containing the primary sub-files by name
    create an array containing the file names (@pfn); each name is the last key in that file
    open ssf1 ssf2 ssf3 ssf4
    $ssf1_record = " ";   # or a value lower than lowest value
    $ssf2_record = " ";   # or a value lower than lowest value
    $ssf3_record = " ";   # or a value lower than lowest value
    $ssf4_record = " ";   # or a value lower than lowest value
    foreach $pf (@pfn) {
        open primary file ($pf)
        read $pf records into a hash ($pfrh)
        # string compare with "le", not "<": the key equal to $pf lives in this sub-file
        while ($ssf1_record le $pf) {
            compare to hash
            if found { action }
            read next ssf1 record (stop at end of file)
        }
        while ($ssf2_record le $pf) {
            compare to hash
            if found { action }
            read next ssf2 record (stop at end of file)
        }
        # likewise for ssf3 and ssf4
    }
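For anyone who wants something runnable, here is a minimal sketch of the steps above. It is only one way to do it and makes a few assumptions that are not in the post: one key per line, keys that are safe to use as file names, and files sorted in plain string order (the same order Perl's `le` and `sort` use). The names `write_lines` and `match_sorted` and the tiny demo data are made up for illustration, and the "action" is simply collecting the matching key.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper for the demo: write one record per line.
sub write_lines {
    my ($path, @lines) = @_;
    open my $fh, '>', $path or die "Cannot write $path: $!";
    print {$fh} "$_\n" for @lines;
    close $fh or die "Cannot close $path: $!";
}

# Match sorted secondary files against sorted primary sub-files.
# Each sub-file in $dir is named for the last (highest) key it holds.
sub match_sorted {
    my ($dir, @secondary_paths) = @_;

    opendir my $dh, $dir or die "Cannot open $dir: $!";
    my @names = grep { -f "$dir/$_" } readdir $dh;
    closedir $dh;
    my @last_keys = sort @names;    # string order, same as "le"

    # Open every secondary file and prime it with its first record.
    my @streams;
    for my $path (@secondary_paths) {
        open my $fh, '<', $path or die "Cannot read $path: $!";
        my $rec = <$fh>;
        chomp $rec if defined $rec;
        push @streams, { fh => $fh, rec => $rec };
    }

    my @found;
    for my $last_key (@last_keys) {
        # Load one sub-file of the primary data into a hash.
        open my $pfh, '<', "$dir/$last_key" or die "Cannot read $last_key: $!";
        my %in_chunk;
        while (my $line = <$pfh>) {
            chomp $line;
            $in_chunk{$line} = 1;
        }
        close $pfh;

        # Advance each secondary file up to and including this sub-file's last key.
        for my $s (@streams) {
            while (defined $s->{rec} && $s->{rec} le $last_key) {
                push @found, $s->{rec} if exists $in_chunk{ $s->{rec} };   # the "action"
                $s->{rec} = readline $s->{fh};
                chomp $s->{rec} if defined $s->{rec};
            }
        }
    }
    close $_->{fh} for @streams;
    return @found;
}

# Tiny demo: two primary sub-files and two secondary files.
my $dir  = tempdir(CLEANUP => 1);
my $pdir = "$dir/primary";
mkdir $pdir or die "Cannot create $pdir: $!";
write_lines("$pdir/cherry", qw(apple banana cherry));
write_lines("$pdir/grape",  qw(date fig grape));
write_lines("$dir/ssf1",    qw(banana cherry elder grape zebra));
write_lines("$dir/ssf2",    qw(apple fig));

my @found = match_sorted($pdir, "$dir/ssf1", "$dir/ssf2");
print "@found\n";
```

Only one sub-file's hash is in memory at a time, and each secondary file is read once from start to finish, so memory is bounded by the sub-file size you pick when splitting.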
Enjoy!
Dageek