Yes, I've profiled the code. The bottlenecks are in the I/O. No surprise there. The SQL queries are right at the top, even before the file I/O.
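For the curious, this is roughly the sort of thing I mean by profiling the database side; a minimal sketch assuming the code goes through DBI, with the database name and connection details made up:

    use strict;
    use warnings;
    use DBI;

    # DBI's built-in profiler; level 2 aggregates elapsed time per statement.
    # The same thing can be switched on externally with: DBI_PROFILE=2 perl script.pl
    my $dbh = DBI->connect('dbi:Informix:configdb', '', '', { RaiseError => 1 });
    $dbh->{Profile} = 2;

    # ... run the normal lookup workload here ...

    $dbh->disconnect;    # the profile summary is written to STDERR when the program exits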
"The good news is that I don't have to - your bottlenecks are in the SQL reads and writes."
Actually, they're exclusively in the SQL reads. We don't have any SQL writes. If we did, I wouldn't have specified that all the configuration data was read-only, now would I?
"I will bet that you don't have good indices, that your loads are updating indices every time, and that you can increase your throughput 100-fold if you had a DBA consultant with 10 years' experience come in for 2 weeks and audit your system."
Well, we have indexes on all our key fields, and we don't do any writes. We don't load data into tables, so it's unlikely that the nonexistent loads are updating the table indexes. It's possible that a DBA with 10 years' experience could wring a 100-fold improvement out of our database, but I confess I don't see how. In any case, I have neither the authority nor the budget to hire a DBA, so it's a moot point.
I can get a 100-fold improvement by moving the lookups out of the SQL tables, though. Timing lookups on one of our tables with the existing code gives 670 lookups/second for the (poorly written, IMHO) database solution, versus 56,818/second for a direct Perl hash lookup. That's close to a 100-fold improvement, and it's one that I can implement.
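For anyone who wants to reproduce that kind of comparison, here's a rough sketch of how the two numbers could be measured side by side; the table and column names ('config', 'name', 'value') are invented, not our actual schema:

    use strict;
    use warnings;
    use DBI;
    use Benchmark qw(cmpthese);

    my $dbh = DBI->connect('dbi:Informix:configdb', '', '', { RaiseError => 1 });
    my $sth = $dbh->prepare('SELECT value FROM config WHERE name = ?');

    # Preload the same data into a plain hash for the comparison.
    my %config = map { @$_ }
        @{ $dbh->selectall_arrayref('SELECT name, value FROM config') };
    my @keys = keys %config;

    cmpthese(-5, {    # run each variant for at least 5 CPU seconds
        sql => sub {
            my $k = $keys[ rand @keys ];
            $sth->execute($k);
            my ($v) = $sth->fetchrow_array;
        },
        hash => sub {
            my $k = $keys[ rand @keys ];
            my $v = $config{$k};
        },
    });

Both closures pay the same cost for picking a random key, so the difference cmpthese reports is essentially the lookup itself.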
"The overarching theme is Know your tools. It doesn't sound like you really understand them." Does Informix have a secret "go fast" switch I can flip? Failing that, I'm not sure what else to try (but I'm certainly open to specific ideas). That's why I went with my own system, having learned in school that relational databases are good for many reasons, but maximum speed isn't one of them.
The simplest approach is just to throw everything into memory in a Perl hash-like structure. This works, but load time becomes a factor, and we hit the per-process memory limit. Raising that limit would require recompiling the HP/UX kernel, which doesn't sound like an option.
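To put a number on that blow-up, here's a small sketch of how to measure it, assuming the configuration lives in a tab-separated flat file (the file name and format are just placeholders) and that Devel::Size is installed:

    use strict;
    use warnings;
    use Devel::Size qw(total_size);

    # Hypothetical flat file: one "key<TAB>value" pair per line.
    my %config;
    open my $fh, '<', 'config.txt' or die "config.txt: $!";
    while (my $line = <$fh>) {
        chomp $line;
        my ($key, $value) = split /\t/, $line, 2;
        $config{$key} = $value;
    }
    close $fh;

    printf "hash uses %.1f MB in memory\n", total_size(\%config) / (1024 * 1024);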
So, I need to balance total memory use against lookup speed. Perl hashes on my system seem to expand memory usage by something like 40:1: I've seen a 7MB flat text file turn into nearly a 100MB Perl hash. Using some smarter data management, I can get that down to a 5MB binary file, including indexes. Reading from it in pure Perl is still ten times faster than the SQL solution, and that's before I start to play with XS.
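To give a flavour of what I mean by smarter data management, here's a simplified sketch, not my actual code: pack the values into one binary file behind a fixed-width, sorted index of (key, offset, length) records, then look keys up with a binary search over the index using seek and read, so each lookup touches only a handful of small records. It assumes keys fit in 32 bytes and contain no NUL bytes.

    use strict;
    use warnings;
    use Fcntl qw(SEEK_SET);

    # Fixed-width index record: 32-byte NUL-padded key, 32-bit offset, 32-bit length.
    my $KEY_LEN  = 32;
    my $REC_LEN  = $KEY_LEN + 8;
    my $REC_PACK = "a$KEY_LEN N N";

    # Build the file: a record count, then the sorted index, then the concatenated values.
    sub build_db {
        my ($file, $data) = @_;    # $data is a hash ref of key => value
        my @index;
        my $values = '';
        for my $key (sort keys %$data) {
            push @index, pack($REC_PACK, $key, length($values), length($data->{$key}));
            $values .= $data->{$key};
        }
        open my $out, '>', $file or die "$file: $!";
        binmode $out;
        print {$out} pack('N', scalar @index), @index, $values;
        close $out or die "$file: $!";
    }

    # Look one key up: binary search the index, then read just that value.
    sub lookup {
        my ($fh, $key) = @_;       # $fh: an already-open read handle on the file
        seek $fh, 0, SEEK_SET;
        read $fh, my $header, 4;
        my $n = unpack 'N', $header;
        my ($lo, $hi) = (0, $n - 1);
        while ($lo <= $hi) {
            my $mid = int(($lo + $hi) / 2);
            seek $fh, 4 + $mid * $REC_LEN, SEEK_SET;
            read $fh, my $rec, $REC_LEN;
            my ($k, $off, $len) = unpack $REC_PACK, $rec;
            $k =~ s/\0+\z//;       # strip the NUL padding from the fixed-width key
            if    ($key lt $k) { $hi = $mid - 1 }
            elsif ($key gt $k) { $lo = $mid + 1 }
            else {
                seek $fh, 4 + $n * $REC_LEN + $off, SEEK_SET;
                read $fh, my $value, $len;
                return $value;
            }
        }
        return;                    # not found
    }

The file handle is passed in so it can stay open for the life of the process; reopening the file on every lookup would eat most of the win.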
This solution seems doable. Shared memory may not be the way to go, but do you see any other solutions? -- Ytrew