Mysterious Monks,

Does this sound at all reasonable to undertake on a LAMP platform?

Imagine a table of numbers: ~70 rows and 5 columns. The rows represent classes of products, the columns represent groups of businesses. A cell, the intersection of a row and column, holds the cross-referenced population of companies that belong to that group and use that class of product. For example: group = 'widget manufacturers' that use product = 'foo bars'.

You have a MySQL DB of lists of companies. The table 'widget manufacturers' lists all the companies that manufacture widgets. The table 'foo bars' lists all the companies that use foo bars. Each table includes a 'company size factor' column for every company in the list. To find the number that goes in the cell at their intersection, you have to run an SQL query to find the intersection of the two tables' 'company' columns, then sum the 'company size factor' for every company in the result.
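
For concreteness, here is a minimal sketch of one such intersection-and-sum query through DBI. The column names ('company', 'size_factor') are my guesses at the schema, not confirmed, and $dbh is assumed to be an open DBI handle:

    # One cell: sum the size factors of companies present in both tables.
    # Column names are assumptions about the schema.
    my ($sum) = $dbh->selectrow_array(q{
        SELECT SUM(g.size_factor)
        FROM `widget manufacturers` AS g
        JOIN `foo bars` AS p ON g.company = p.company
    });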

There are only a few hundred business-group tables, and ~12,000 product-class tables, but the output table you have to generate will always include only 70 product classes and 5 groups. There are ~30,000 companies. This is a ~100 MB database, although each individual table might only be ~10 KB on average, with anywhere from 10 rows (table 'enriched uranium') to 10,000 rows (table 'ballpoint pens').

So to produce an output table, there are 5 x 70 = 350 intersection-and-sum SQL queries to run. There are a few more basic queries to run for related data used in calculations, so it's going to be ~400 queries in total.
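
A rough shape of the report loop, assuming the 5 group tables and 70 class tables are known up front (they come from your own lists, not user input, so interpolating the names is tolerable here; the column names and connect parameters are still my assumptions):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use DBI;

    my $dbh = DBI->connect('dbi:mysql:database=reports;host=localhost',
                           'user', 'password', { RaiseError => 1 });

    my @groups  = ('widget manufacturers');  # ...the 5 business-group tables
    my @classes = ('foo bars');              # ...the 70 product-class tables

    my %cell;
    for my $g (@groups) {
        for my $c (@classes) {
            # one intersection-and-sum query per cell: 5 x 70 = 350 in all
            my ($sum) = $dbh->selectrow_array(qq{
                SELECT SUM(g.size_factor)
                FROM `$g` AS g
                JOIN `$c` AS p ON g.company = p.company
            });
            $cell{$g}{$c} = defined $sum ? $sum : 0;
        }
    }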

To complete the table, there are 3 calculations to perform on each result, reported in additional columns beside the base number, plus some related calculations, so roughly 1,500 calculations in all.

Then output it all to a browser in table format.
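
Emitting that as an HTML table from a CGI script is then just string assembly; a bare-bones sketch using the %cell hash from the loop above:

    # Bare-bones CGI output of the cross-tab as an HTML table.
    print "Content-type: text/html\n\n";
    print "<table border=\"1\">\n<tr><th></th>";
    print "<th>$_</th>" for @classes;
    print "</tr>\n";
    for my $g (@groups) {
        print "<tr><th>$g</th>";
        print "<td>$cell{$g}{$_}</td>" for @classes;
        print "</tr>\n";
    }
    print "</table>\n";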

And of course, users want this table spit out in the blink of an eye. This is the part I wonder about: the speed. This will be Perl DBI and a MySQL DB, and I'm not Google's development & optimization team.

I know there are a lot of variables, but is it within the realm of reason to code something that will spit out a report in, say, under 5 to 10 seconds? Or is this so obviously impossible that I should just say so and save everyone a lot of heartache?

Thanks




Forget that fear of gravity,
Get a little savagery in your life.
