in reply to Big database queries

What is the purpose of your query? To Summarize numeric data? A VLDB (Very Large Database) or DW (Data Warehouse) requires a different set of rules. And most the rules affect the database. Perl can only do so much until the database processing time overwhelms it. One trick to help with summary data is to pre-aggregate the summarized data in another table. Then you only query this smaller table. But your requirements determine which tables and columns are pre-aggregated. You cannot pre-aggregate all columns, so you must decide which ones are the important ones to report or query on.

Example: 1 Billion rows in a Master table of customer data called CUST. CUST contains the following important columns: zip code, state, year to date purchases.

Requirement: To summarize all the year to date sales for each state for each month.

Solution: Create a table called SUMSTATE that contains the fields that you wish to query on from the CUST table. Add a column PERIOD that is defined as date time. PERIOD will contain the month and year.

Processing Option One: On the fly, create the SUMSTATE table and populate it using a stored procedure that summarizes the data from CUST. This can be done incrementally so that the server or cluster is not impacted.

Processing Option Two: Each time a row is added or updated in CUST, trigger a stored procedure that updates SUMSTATE.

Processing Option Two is the preferred method, but each situation can be different.

Richard

There are three types of people in this world, those that can count and those that cannot. Anon