Re^3: Get unique fields from file

by LanX (Sage)
on Jan 08, 2022 at 15:27 UTC (#11140269)

in reply to Re^2: Get unique fields from file
in thread Get unique fields from file

> Depending upon the data of course, your HoH (hash of hash) structure could consume quite a bit more memory than the actual file size in MB.

This shouldn't be a problem if you apply a sliding window technique° and split the hashes into easily swappable chunks².

The trick is to balance time, space and disk access by minimizing the number of swaps.

This will scale well, up to the limit imposed by available disk space.
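To make the chunking idea concrete, here is a minimal sketch of one possible reading of it, not necessarily what the footnotes point to. It partitions the values of one column into bucket files on disk and then deduplicates one bucket at a time, so only a single chunk's hash is ever in memory. The sub name `unique_fields_chunked`, its parameters, and the bucket file layout are all made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper (not from the thread): report each distinct value of
# one column, holding only one bucket's %seen hash in memory at a time.
#   $file    - input file, one record per line
#   $column  - zero-based index of the field to deduplicate
#   $sep     - literal field separator, e.g. "\t"
#   $buckets - number of on-disk chunks (more chunks = less memory each)
#   $emit    - callback invoked once per distinct value
sub unique_fields_chunked {
    my ($file, $column, $sep, $buckets, $emit) = @_;
    my $dir = tempdir(CLEANUP => 1);

    # Pass 1: route every value to a bucket file chosen by a cheap byte
    # checksum, so equal values always end up in the same bucket.
    my @out;
    for my $i (0 .. $buckets - 1) {
        open $out[$i], '>', "$dir/bucket.$i" or die "bucket $i: $!";
    }
    open my $in, '<', $file or die "$file: $!";
    while (my $line = <$in>) {
        chomp $line;
        my $value = (split /\Q$sep\E/, $line, -1)[$column];
        next unless defined $value;    # record has too few fields
        my $i = unpack('%32C*', $value) % $buckets;
        print { $out[$i] } "$value\n";
    }
    close $in;
    for my $fh (@out) {
        close $fh or die "close bucket: $!";
    }

    # Pass 2: deduplicate each bucket separately; %seen is discarded
    # before the next bucket is read, which is the "swap".
    for my $i (0 .. $buckets - 1) {
        open my $fh, '<', "$dir/bucket.$i" or die "bucket $i: $!";
        my %seen;
        while (my $value = <$fh>) {
            chomp $value;
            $emit->($value) unless $seen{$value}++;
        }
        close $fh;
    }
    return;
}
```

A call might look like `unique_fields_chunked('data.tsv', 1, "\t", 64, sub { print "$_[0]\n" })`, where `data.tsv` is a placeholder name. The bucket count is the knob for the time/space/disk trade-off: each bucket is written once and read once, so the number of swaps equals the number of buckets, and the distinct values come out grouped by bucket rather than in input order.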

Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery

°) see

²) see
