in reply to Re^2: Get unique fields from file
in thread Get unique fields from file
> Depending upon the data of course, your HoH (hash of hash) structure could consume quite a bit more memory than the actual file size in MB.
This shouldn't be a problem if you apply a sliding window technique° plus splitting the hashes into easily swappable chunks².
The trick is to balance time, space and disk access by minimizing the number of swaps.
This will scale well until you hit the limit set by the available disk space.
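A minimal sketch of the chunking half, to make the idea concrete. This is only one possible reading, not necessarily what the footnoted links describe. The sub name `unique_fields_chunked`, the checksum-based partitioning and the temp-file layout are all assumptions made for illustration. It also assumes the fields are byte strings without embedded newlines. Each field is spilled to one of N chunk files, then every chunk is deduplicated on its own, so only one small `%seen` hash lives in memory at a time.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper (not from the thread):
#   $next_field : code ref returning the next field, or undef when done
#   $n_chunks   : number of on-disk chunks to split the key space into
#   $emit       : code ref called once for each unique field
sub unique_fields_chunked {
    my ($next_field, $n_chunks, $emit) = @_;
    my $dir = tempdir(CLEANUP => 1);

    # Pass 1: spill every field to the chunk file picked by a cheap checksum.
    my @fh;
    for my $i (0 .. $n_chunks - 1) {
        open $fh[$i], '>', "$dir/chunk$i" or die "open chunk$i: $!";
    }
    while (defined(my $field = $next_field->())) {
        my $i = unpack('%32C*', $field) % $n_chunks;   # byte-sum checksum
        print { $fh[$i] } "$field\n";
    }
    for my $out (@fh) {
        close $out or die "close: $!";
    }

    # Pass 2: dedupe one chunk at a time. Equal fields always land in the
    # same chunk, so a per-chunk %seen hash is enough.
    for my $i (0 .. $n_chunks - 1) {
        open my $in, '<', "$dir/chunk$i" or die "open chunk$i: $!";
        my %seen;
        while (my $line = <$in>) {
            chomp $line;
            $emit->($line) unless $seen{$line}++;
        }
        close $in or die "close: $!";
    }
    return;
}
```

Called as `unique_fields_chunked(sub { shift @fields }, 16, sub { print "$_[0]\n" })`, it prints each distinct field once, grouped by chunk rather than in input order. More chunks means smaller hashes but more open filehandles and more disk traffic, which is the time/space/disk balance mentioned above.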
Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery
°) see
²) see