For a file large enough to be a problem, Perl should be reading in one line at a time and loading it into a database.
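For concreteness, that approach might look something like the sketch below. This is my own minimal illustration, assuming DBD::SQLite is installed; the seen.db file name, the seen table and its single line column are invented for the example:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use DBI;

    # Sketch only: dedup via SQLite, keeping the first occurrence of each line.
    my $dbh = DBI->connect( "dbi:SQLite:dbname=seen.db", "", "",
        { RaiseError => 1, AutoCommit => 0 } );
    $dbh->do("CREATE TABLE IF NOT EXISTS seen (line TEXT PRIMARY KEY)");

    # INSERT OR IGNORE silently skips lines already present in the table.
    my $ins = $dbh->prepare("INSERT OR IGNORE INTO seen (line) VALUES (?)");
    while ( my $line = <> ) {
        chomp $line;
        # execute() returns "0E0" (numerically zero) when the insert was ignored,
        # so a positive return value means this is the first time the line was seen.
        print "$line\n" if $ins->execute($line) > 0;
    }
    $dbh->commit;
    $dbh->disconnect;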
As usual, you'd need to benchmark the specific application to know which is faster.
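One way to run such a benchmark is with the core Benchmark module. The sketch below is mine, not from the thread: it compares membership tests against a Perl hash and an in-memory SQLite table holding the same keys, with the key count, table name and schema chosen arbitrarily for illustration:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Benchmark qw(cmpthese);
    use DBI;

    # Build the same set of keys in a Perl hash and an in-memory SQLite table.
    my @keys = map { "key_$_" } 1 .. 100_000;
    my %hash = map { $_ => 1 } @keys;

    my $dbh = DBI->connect( "dbi:SQLite:dbname=:memory:", "", "", { RaiseError => 1 } );
    $dbh->do("CREATE TABLE seen (k TEXT PRIMARY KEY)");
    my $ins = $dbh->prepare("INSERT INTO seen (k) VALUES (?)");
    $dbh->begin_work;
    $ins->execute($_) for @keys;
    $dbh->commit;
    my $sel = $dbh->prepare("SELECT 1 FROM seen WHERE k = ?");

    # Each sub looks up every key once; run each for at least 3 CPU seconds.
    cmpthese( -3, {
        perl_hash => sub {
            my $hits = 0;
            for (@keys) { $hits++ if exists $hash{$_} }
        },
        sqlite => sub {
            my $hits = 0;
            for (@keys) { $sel->execute($_); $hits++ if $sel->fetchrow_array }
        },
    } );

Only the relative numbers on your own data and hardware matter, of course, which is why benchmarking the specific application beats trusting anyone's general figures.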
In this thread, where some have suggested using an external database instead of a gigantic Perl hash, I'm reminded of this quote from BrowserUk:
I've run Perl's hashes up to 30 billion keys/2 terabytes (ram) and they are 1 to 2 orders of magnitude faster, and ~1/3rd the size of storing the same data (64-bit integers) in an sqlite memory-based DB. And the performance difference increases as the size grows. Part of the difference is that however fast the C/C++ DB code is, calling into it from Perl, adds a layer of unavoidable overhead that Perl's built-in hashes do not have.
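For the problem in this thread, the hash approach boils down to very little code; a minimal sketch that prints each input line only the first time it is seen:

    #!/usr/bin/perl
    use strict;
    use warnings;

    # %seen acts as an in-memory set of lines already printed.
    my %seen;
    while ( my $line = <> ) {
        print $line unless $seen{$line}++;
    }

The post-increment does double duty: it tests whether the key existed and creates it in the same step, so no separate exists check is needed.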
In Re: Fastest way to lookup a point in a set, when asked if he had tried a database, erix replied: "I did. It was so spectacularly much slower that I didn't bother posting it".
In Re: Memory efficient way to deal with really large arrays? by Tux, Perl benchmarked way faster than every database tried (SQLite, Pg, mysql, MariaDB).
With memory relentlessly getting bigger and cheaper (a DDR4 DIMM can hold up to 64 GB, while DDR5 octuples that to 512 GB), doing everything in memory with huge arrays and hashes is becoming more practical over time.
In reply to Re^4: How can I keep the first occurrence from duplicated strings? by eyepopslikeamosquito
in thread How can I keep the first occurrence from duplicated strings? by Anonymous Monk