When working with data and manipulating the data in various ways, always think "database"!

I am sorry, but I have to object very strongly to this, especially to the word "always".

Databases are very good at two things: data persistence and the ability to manage data volumes too big for the computer's memory. A further benefit, if your database is SQL, is that the SQL language is a very high-level and practical language that hides many implementation details.

But databases also have a lot of limitations. First, they are horribly slow (compared to hashes in memory). And the languages used to manipulate them, such as PL/SQL, are often horribly slow as well. Of course, this probably does not matter if you have just tens of thousands of records. But when you get to millions or tens of millions of records, the difference is huge.

So, if you don't need to store data in a persistent fashion, you probably should not use a database, or, at least, think twice before you do.

About a year and a half ago, I was asked to try to improve the performance of a very complicated extraction process on a database. An initial timing test projected an execution time of 160 days. After some profiling and benchmarking work, I was able to reduce it to about 60 days, 59.5 of which were spent in a very complicated trans-codification process. Not too bad, but still obviously a no-go. I switched to extracting raw data files and reprocessing the flat files in pure Perl. The overall extraction time fell to about 12 or 13 hours, and the trans-codification part, using half a dozen Perl hashes, fell from 59.5 days to just about an hour, i.e. an improvement by a factor of about 1,400.
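To give an idea of what hash-based trans-codification of a flat file can look like, here is a minimal sketch. It is not the actual code from that project: the semicolon-separated record layout (id;country;status), the two lookup hashes and the sub names transcode_line and transcode_file are all hypothetical, and the sketch assumes well-formed input lines.

```perl
use strict;
use warnings;

# Hypothetical lookup tables (assumed for illustration). In a real run
# they would be loaded once, e.g. from reference files.
my %country_code = (FR => '250', DE => '276', US => '840');
my %status_code  = (A => 'ACTIVE', S => 'SUSPENDED', C => 'CLOSED');

# Transcode one record of the assumed form "id;country;status".
sub transcode_line {
    my ($line) = @_;
    chomp $line;
    my ($id, $country, $status) = split /;/, $line, -1;
    # Keep the original value when the hash has no mapping for it.
    $country = $country_code{$country} // $country;
    $status  = $status_code{$status}   // $status;
    return join ';', $id, $country, $status;
}

# Stream the flat file line by line: only the lookup hashes stay in
# memory, not the data being processed.
sub transcode_file {
    my ($in_path, $out_path) = @_;
    open my $in,  '<', $in_path  or die "Cannot read $in_path: $!";
    open my $out, '>', $out_path or die "Cannot write $out_path: $!";
    while (my $line = <$in>) {
        print {$out} transcode_line($line), "\n";
    }
    close $in;
    close $out or die "Cannot close $out_path: $!";
    return;
}
```

Each translation is a single in-memory hash lookup per field, with no round trip to a database for every record, which is where the kind of gain described above comes from.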

No, it is a bit more complicated than that. Databases are very useful, there is no doubt about it, but they are certainly not the solution to everything, far from it, especially when performance is important.


In reply to Re^2: Finding Minimum Value by Laurent_R
in thread Finding Minimum Value by jimmy88
