
Given precise knowledge of the data's signature (the seeker said something about 12-byte keys, for example), one can build very fine wheels using optimized algorithms, with Perl or without.

The odds are very, very high that the overhead of working in Perl would wipe out any possible win you could get from knowing that the keys are 12 bytes. While the problem may sound like a fun challenge, this is an example of a common optimization mistake. If you are always looking for ways to rewrite and speed up bits of code, your overall program is almost guaranteed to wind up slower than it would have been if you had used good development practices. Why? Simply because you lose sight of the forest for the trees. You spend so long making your code unmaintainable that you are unable to spot the "low-hanging fruit" that inevitably provides the biggest improvements. See the sample section from Code Complete for more on this. (I recommend the whole book, but that is another story.) If you want more Perl-specific optimization advice, try Re (tilly) 1: Optimizations and Efficiency.

Naturally a starting point would be to look at a DBM implementation, but I wonder why, in a Seekers of Perl Wisdom section, one would recommend not going the hard way and learning a lot of stuff.

Perhaps because the section is named Perl Wisdom and not Perl Masochists?

Reinventing excellent wheels that you can get for free may be good stuff for an algorithms and data structures class. Understanding this stuff may be wonderful for your evolution as a programmer. But deciding to launch into that when you just need to get something done is stupid. And it is an important lesson to learn not to bother doing that, but to instead learn to reuse existing work when and where that is appropriate.

See Modules Vs. Manual Coding for further discussion.
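
To make the point concrete, here is roughly what reusing an existing DBM looks like from Perl. This is only a minimal sketch, assuming the core SDBM_File module is adequate; for millions of records you would probably want a more capable DBM such as DB_File instead. The file name, the zero-padded 12-character key format, and the values are all made up for illustration.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDWR O_CREAT);
use SDBM_File;
use File::Temp qw(tempdir);

# Hypothetical location for the data file; a temp dir keeps the sketch tidy.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/records";

# Tie an ordinary hash to an on-disk DBM file. After this, reads and
# writes on %records go to disk instead of living in memory.
tie(my %records, 'SDBM_File', $path, O_RDWR | O_CREAT, 0666)
    or die "Cannot tie $path: $!";

# Store some records under 12-byte keys (illustrative data only).
for my $n (1 .. 1000) {
    my $key = sprintf '%012d', $n;
    $records{$key} = "value $n";
}

# Look one up exactly as you would with an in-memory hash.
my $key = sprintf '%012d', 42;
print $records{$key}, "\n";
```

When you are done, untie %records to release the file. The rest of the program never needs to know that the hash is not in memory, which is the whole appeal of reusing the wheel.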

And if you can't teach him/her, there might either be others who can or the seeker might just go his own way and find out himself.

It would be nice if you were able to see that quote from my perspective and decide to apologize for the intended insult.

FYI the quote that you were responding to was not an admission of ignorance on my part. Rather it was a comment on how great the gap is between asking the question, "How do dbms work?" and having a chance at writing one that outperforms a good one.

If you want to disbelieve me, go ahead. In that case, for someone with a Perl background and no CS background, I would suggest starting at Bricolage: B-Trees and seeing how far you get. That will at least give you a key algorithm. But, for instance, it won't go far enough into the details of how to really do it for you to understand what any of the key parameters are that people want to tune in real dbms, let alone why they matter...

In reply to Re (tilly) 6: millions of records in a Hash by tilly
in thread millions of records in a Hash by johnkj
