We've got an existing system which loads data by doing individual SQL queries for each record read, and caching results. We have multiple processors on our HP/UX system, so we take advantage of this by running multiple copies of our program, but the results are still far too slow.
Your problem has nothing to do with configuration data or shared memory or any crap like that. Have you benchmarked where your bottlenecks are? It doesn't sound like you have.
The good news is that I don't have to: your bottlenecks are in the SQL reads and writes. I will bet that you don't have good indices, that your loads are updating indices on every write, and that you could increase your throughput 100-fold if you had a DBA consultant with 10 years' experience come in for 2 weeks and audit your system.
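If you do want numbers to confirm that, a rough timing harness is enough to show where the wall-clock time goes. This is a minimal sketch, not a description of your system: `fetch_record` and `process_record` are hypothetical stand-ins for your per-record SQL read and your processing step, and `timed` is a helper name I made up. Only the core `Time::HiRes` module is assumed.

```perl
use strict;
use warnings;
use Time::HiRes ();

my %elapsed;    # phase name => accumulated wall-clock seconds

# Run a code block and add its wall-clock time to the named phase.
sub timed {
    my ($phase, $code) = @_;
    my $start  = Time::HiRes::time();
    my @result = $code->();
    $elapsed{$phase} += Time::HiRes::time() - $start;
    return wantarray ? @result : $result[0];
}

# Hypothetical stand-ins: replace these with the real SQL read
# and the real per-record processing.
sub fetch_record   { my ($id)  = @_; return { id => $id, value => $id * 2 } }
sub process_record { my ($rec) = @_; return $rec->{value} + 1 }

my $total = 0;
for my $id (1 .. 1000) {
    my $rec = timed(fetch   => sub { fetch_record($id) });
    $total += timed(process => sub { process_record($rec) });
}

# Report the accumulated time per phase.
printf "%-8s %.6f s\n", $_, $elapsed{$_} for sort keys %elapsed;
```

If the fetch phase dominates, the problem is in the database access and not in the Perl code around it. For a finer-grained picture, a real profiler will tell you more than hand-placed timers.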
A few items for you to look at:
- Are your tables normalized?
- Are you using the RDBMS's data extraction and loading features? They will be up to a thousand times faster than anything written in Perl.
- Do you have indices on the tables you're reading? Have the tables you're reading been optimized?
- Do you have indices on the tables you're writing to? If you disable them while loading and rebuild them afterwards, you can increase throughput 10-100x.
- With your multiple processes, are you hitting mutexes that are negating all benefits of SMP? For example, many RDBMSes will not let two processes update the same table at once, especially if the rows are in the same page of memory. Likewise, some will not let you insert into the same table at once, but some will. And, even crazier, some are configurable.
- Are you taking advantage of all the RDBMS features? For example, MySQL will let you insert many records (1000 or more) in a single multi-row INSERT statement. Oracle historically has not supported that syntax, but has other bulk-loading features.
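To make the multi-row insert point concrete, here is a minimal sketch of batching rows into one statement with placeholders. The table name `records`, the columns `id` and `name`, and the helper names `build_batch_insert` and `batches` are all made up for illustration, and the multi-row `VALUES` syntax is an assumption about your RDBMS (MySQL accepts it; check your own). The sketch only builds the SQL and the bind list, so it runs without a database.

```perl
use strict;
use warnings;

# Build one multi-row INSERT with placeholders for a batch of rows.
# Table and column names are interpolated, not bound, so they must
# come from trusted code, never from user input.
sub build_batch_insert {
    my ($table, $columns, $rows) = @_;
    die "no rows to insert\n" unless @$rows;

    my $row_placeholders = '(' . join(', ', ('?') x @$columns) . ')';
    my $sql = sprintf 'INSERT INTO %s (%s) VALUES %s',
        $table,
        join(', ', @$columns),
        join(', ', ($row_placeholders) x @$rows);

    # Flatten the rows into one bind list, in placeholder order.
    my @bind = map { @$_ } @$rows;
    return ($sql, @bind);
}

# Split a list of rows into batches of at most $size rows.
sub batches {
    my ($size, @rows) = @_;
    my @out;
    push @out, [ splice @rows, 0, $size ] while @rows;
    return @out;
}

my @rows = ( [ 1, 'alpha' ], [ 2, 'beta' ], [ 3, 'gamma' ] );
for my $batch ( batches(2, @rows) ) {
    my ($sql, @bind) = build_batch_insert('records', [qw(id name)], $batch);
    print "$sql\n";
    # With a connected DBI handle this would be:
    #   $dbh->do($sql, undef, @bind);
}
```

One round trip per batch instead of one per record is where the gain comes from. Pick the batch size by measuring, since the practical ceiling depends on the server's statement and packet size limits.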
The overarching theme is "know your tools". It doesn't sound like you really understand them.
Being right, does not endow the right to be rude; politeness costs nothing.
Being unknowing, is not the same as being stupid.
Expressing a contrary opinion, whether to the individual or the group, is more often a sign of deeper thought than of cantankerous belligerence.
Do not mistake your goals as the only goals; your opinion as the only opinion; your confidence as correctness. Saying you know better is not the same as explaining you know better.