For a file large enough to be a problem, Perl should be reading in one line at a time and loading it into a database.
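For concreteness, that approach might look something like the sketch below. This is my own minimal illustration, assuming DBD::SQLite is installed; the seen.db file name, the seen table and its single line column are invented for the example:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use DBI;

    # Sketch only: dedup via SQLite, keeping the first occurrence of each line.
    my $dbh = DBI->connect( "dbi:SQLite:dbname=seen.db", "", "",
        { RaiseError => 1, AutoCommit => 0 } );
    $dbh->do("CREATE TABLE IF NOT EXISTS seen (line TEXT PRIMARY KEY)");

    # INSERT OR IGNORE silently skips lines already present in the table.
    my $ins = $dbh->prepare("INSERT OR IGNORE INTO seen (line) VALUES (?)");
    while ( my $line = <> ) {
        chomp $line;
        # execute() returns "0E0" (numerically zero) when the insert was ignored,
        # so a positive return value means this is the first time the line was seen.
        print "$line\n" if $ins->execute($line) > 0;
    }
    $dbh->commit;
    $dbh->disconnect;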
As usual, you'd need to benchmark the specific application to know which is faster.
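One way to run such a benchmark is with the core Benchmark module. The sketch below is mine, not from the thread: it compares membership tests against a Perl hash and an in-memory SQLite table holding the same keys, with the key count, table name and schema chosen arbitrarily for illustration:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Benchmark qw(cmpthese);
    use DBI;

    # Build the same set of keys in a Perl hash and an in-memory SQLite table.
    my @keys = map { "key_$_" } 1 .. 100_000;
    my %hash = map { $_ => 1 } @keys;

    my $dbh = DBI->connect( "dbi:SQLite:dbname=:memory:", "", "", { RaiseError => 1 } );
    $dbh->do("CREATE TABLE seen (k TEXT PRIMARY KEY)");
    my $ins = $dbh->prepare("INSERT INTO seen (k) VALUES (?)");
    $dbh->begin_work;
    $ins->execute($_) for @keys;
    $dbh->commit;
    my $sel = $dbh->prepare("SELECT 1 FROM seen WHERE k = ?");

    # Each sub looks up every key once; run each for at least 3 CPU seconds.
    cmpthese( -3, {
        perl_hash => sub {
            my $hits = 0;
            for (@keys) { $hits++ if exists $hash{$_} }
        },
        sqlite => sub {
            my $hits = 0;
            for (@keys) { $sel->execute($_); $hits++ if $sel->fetchrow_array }
        },
    } );

Only the relative numbers on your own data and hardware matter, of course, which is why benchmarking the specific application beats trusting anyone's general figures.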
In this thread, where some have suggested using an external database instead of a gigantic Perl hash, I'm reminded of this quote from BrowserUk:
I've run Perl's hashes up to 30 billion keys/2 terabytes (ram) and they are 1 to 2 orders of magnitude faster, and ~1/3rd the size of storing the same data (64-bit integers) in an sqlite memory-based DB. And the performance difference increases as the size grows. Part of the difference is that however fast the C/C++ DB code is, calling into it from Perl, adds a layer of unavoidable overhead that Perl's built-in hashes do not have.
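For the problem in this thread, the hash approach boils down to very little code; a minimal sketch that prints each input line only the first time it is seen:

    #!/usr/bin/perl
    use strict;
    use warnings;

    # %seen acts as an in-memory set of lines already printed.
    my %seen;
    while ( my $line = <> ) {
        print $line unless $seen{$line}++;
    }

The post-increment does double duty: it tests whether the key existed and creates it in the same step, so no separate exists check is needed.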
In Re: Fastest way to lookup a point in a set, when asked if he had tried a database, erix replied: "I did. It was so spectacularly much slower that I didn't bother posting it".
In Re: Memory efficient way to deal with really large arrays? by Tux, Perl benchmarked way faster than every database tried (SQLite, Pg, mysql, MariaDB).
With memory relentlessly getting bigger and cheaper (a DDR4 DIMM can hold up to 64 GB, while DDR5 octuples that to 512 GB), doing everything in memory with huge arrays and hashes is becoming more practical over time.
In reply to Re^4: How can I keep the first occurrence from duplicated strings? by eyepopslikeamosquito
in thread How can I keep the first occurrence from duplicated strings? by Anonymous Monk