
But I'm not clear on how we went from talking about 250 milliseconds-per-query in paragraph 2 to 250 queries-per-millisecond in paragraph 4.

You're right. The post was written in two stages. Originally it was based on a few lines of code that I threw together to test the idea out. No subroutines (or their overhead). Only positive match detection. Much smaller datasets. It worked, and I started writing the post on that basis. Then I realised that it was way too limited in the types of questions it could answer, and that the hard-coded scale of the test was limiting, so I went back and improved things.

The numbers in paragraph 4 are leftovers from the original, artificially simpler, but faster tests. I will update the node.

As an aside, the same technique can be applied even to datasets where the answers are not yes/no, provided the set of possible answers can be reduced to a reasonably small number of discrete values, i.e. multiple choice, as you are doing.
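A minimal sketch of what that could look like, in Python for brevity. It assumes the technique is "one bit vector per possible answer, one bit per respondent"; the function names and the sample answers are made up for illustration and are not taken from the original tests.

```python
from collections import defaultdict

def build_choice_masks(answers):
    """Return {choice: bitmask}; bit i is set if respondent i gave that choice."""
    masks = defaultdict(int)
    for respondent, choice in enumerate(answers):
        masks[choice] |= 1 << respondent
    return dict(masks)

def respondents(mask):
    """List the respondent numbers whose bits are set in mask."""
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]

# Hypothetical data: answers of five respondents to two multiple-choice questions.
q1 = build_choice_masks(["a", "c", "b", "a", "c"])
q2 = build_choice_masks(["x", "x", "y", "y", "x"])

# Who answered "a" to question 1 AND "y" to question 2? One bitwise AND.
both = q1["a"] & q2["y"]
print(respondents(both))
```

Each discrete answer costs one extra bit vector per question, which is why the range of answers needs to stay reasonably small.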

All too often you see applications storing things like dates, ages and other forms of continuously variable numeric values, when all the application really requires is "Under 18 | over 18 and under 65 | over 65" or similar. Such values can easily be replaced by an enumeration, and many DBs handle these quite efficiently.
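As a rough illustration of that reduction, here is a small Python sketch. The names `AgeBand` and `age_band` are hypothetical, and the boundary handling (18 and 65 counted in the upper band) is my assumption, since the bands above do not say where exactly 18 and 65 fall.

```python
from enum import IntEnum

class AgeBand(IntEnum):
    UNDER_18 = 0
    ADULT = 1    # 18 to 64 inclusive (assumed boundary)
    SENIOR = 2   # 65 and over (assumed boundary)

def age_band(age):
    """Collapse a continuous age into one of three discrete bands."""
    if age < 18:
        return AgeBand.UNDER_18
    if age < 65:
        return AgeBand.ADULT
    return AgeBand.SENIOR

# Store the small integer band instead of the age or date of birth.
bands = [age_band(a) for a in (12, 18, 40, 64, 65, 90)]
print([b.name for b in bands])
```

Once only the band is stored, each record needs a couple of bits for this attribute rather than a full date or integer.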

Unfortunately, they also tend to apply arbitrary limits to the size of an enumeration: 32, 64, etc. The shame is that in many cases these limits coincide with the limits on the number of indexes that may be applied to a given table (MySQL:32, DB2:(was)255). In many cases, large enumeration types could substitute for large numbers of indexes with considerably better efficiency. They can also remove the need for foreign keys in some cases, for another hike in efficiency.

Examine what is said, not who speaks.

"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
"Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon

In reply to Re^2: Basic Perl trumps DBI? Or my poor DB design? by BrowserUk
in thread Basic Perl trumps DBI? Or my poor DB design? by punch_card_don
