    while (<>) { print unless $seen{$_}++; }

You could also shrink the memory usage by computing your own hash value and using that as the %seen key -- but I don't think I'm going to get into any more details unless the original poster swears that this has nothing to do with harvesting addresses for spammers.
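For readers wondering what "computing your own hash value" might look like, here is a minimal sketch, not the poster's own code. It assumes a 16-byte binary MD5 digest (via the core Digest::MD5 module) is an acceptable stand-in for each line; the sub name dedupe_stream and the sample data are made up for illustration. It only saves memory when lines average well over 16 bytes, and in principle two different lines could collide on the same digest, in which case the later one would be dropped.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5);

# Copy lines from $in to $out, skipping any line already seen.
# %seen is keyed on the 16-byte binary MD5 digest of each line
# instead of the line itself (an assumption of this sketch).
sub dedupe_stream {
    my ($in, $out) = @_;
    my %seen;
    while (my $line = <$in>) {
        print {$out} $line unless $seen{ md5($line) }++;
    }
    return;
}

# Small demonstration using in-memory filehandles (hypothetical data).
my $input = "apple\nbanana\napple\n";
open my $in,  '<', \$input       or die "cannot open input: $!";
my $output = '';
open my $out, '>', \$output      or die "cannot open output: $!";
dedupe_stream($in, $out);
close $out;
print $output;    # prints "apple" and "banana", one per line
```

For a real file you would pass \*STDIN and \*STDOUT (or handles from open) instead of the in-memory ones. First occurrences are kept in their original order, the same as with the one-liner.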
In reply to Re: Re: Removing duplicates in large files (a hash, or divide-and-conquer)
by sfink
in thread Removing duplicates in large files
by TIURIC