
The version I'm working on is 8.3.5 with no autovacuum, though I run VACUUM manually before each test (or batch of tests), and I'm now experimenting with vacuuming from within the Perl code. My test data is small: I test with 10,000 to 100,000 records.

The real system gets ~5 new records per second, plus entries in all the queue tables for each of those (and it's the queue tables that are updated and deleted from). The idea is that the queue tables stay "small", preferably as close to 0 records as possible. It runs a relatively recent 8.x version on a dedicated dual-Xeon server with 2 (or perhaps 4) GB of RAM. At most, the last 3 days of data are kept there. The queue tables are vacuumed manually every couple of hours, since each of them gets ~500K records added and deleted each day. The main data table is partitioned per day (~500K records per partition, added and from time to time updated, 3 partitions), and each partition is deleted after 3 days, so no vacuum runs on them.

I know that a full vacuum locks the whole table. The fact that only two processes work with any given queue table (the process for the previous state, and the one processing that queue/state) makes that acceptable.
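
For the "vacuuming from within Perl code" part, here is a minimal sketch of how it could look through DBI. The helper name `vacuum_queues`, its `$full` switch and the table names are made up for illustration. It assumes a handle opened with `AutoCommit => 1` (VACUUM cannot run inside a transaction block) and `RaiseError => 1`, so a failed statement dies instead of being silently ignored.

```perl
use strict;
use warnings;

# Hypothetical helper: vacuum each named queue table through an
# existing DBI handle. Returns the SQL statements that were run.
sub vacuum_queues {
    my ($dbh, $full, @tables) = @_;

    # VACUUM cannot run inside a transaction block, so the handle
    # has to be in AutoCommit mode.
    die "VACUUM needs an AutoCommit handle\n" unless $dbh->{AutoCommit};

    # VACUUM FULL takes an exclusive lock on the table; plain VACUUM does not.
    my $command = $full ? 'VACUUM FULL ANALYZE' : 'VACUUM ANALYZE';

    my @done;
    for my $table (@tables) {
        # quote_identifier() protects against odd characters in table names.
        my $sql = join ' ', $command, $dbh->quote_identifier($table);
        $dbh->do($sql);
        push @done, $sql;
    }
    return @done;
}
```

A call might look like `vacuum_queues($dbh, 0, 'queue_new', 'queue_sent')` for a plain vacuum, or with a true second argument for a full one. Both table names are placeholders.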

IMHO the problem is that DBs aren't primarily meant to be queues where 5 million records are added and deleted per day (an estimate for 10 queue tables). But at the same time you need transactions to be sure where each and every record is in the processing. OTOH all the servers have up-times of up to 3 years, so it might be OK to do everything in RAM and only occasionally update the DB. Another alternative I'm looking at is DBM::Deep, since its sales pitch is quite impressive:

A unique flat-file database module, written in pure perl. True multi-level hash/array support (unlike MLDBM, which is faked), hybrid OO / tie() interface, cross-platform FTPable files, ACID transactions, and is quite fast. Can handle millions of keys and unlimited levels without significant slow-down.

In this system everything is accessed by its Key/ID (did anyone mention hashes?) or in FIFO sequence (arrays?), with only some reports needing real SQL. Of course, using DBM::Deep would give the PHP guys who are working on the (web) administration interface quite a headache :) So I'm guessing I'll try to just shift those queues to DBM::Deep and see what happens.
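
To make the "shift a queue to DBM::Deep" idea concrete, here is a rough sketch of one FIFO queue kept in a DBM::Deep file. The layout (`head`/`tail` counters plus an `items` hash keyed by sequence number) and the helper names `open_queue`, `in_txn`, `enqueue` and `dequeue` are assumptions made for this example, not anything DBM::Deep prescribes. It uses the module's documented `begin_work`/`commit`/`rollback` calls, which need `num_txns` set to at least 2 when the file is created.

```perl
use strict;
use warnings;
use DBM::Deep;

# Open (or create) one queue file and make sure the assumed layout exists.
sub open_queue {
    my ($file) = @_;
    my $db = DBM::Deep->new(
        file      => $file,
        locking   => 1,   # file locking around each access
        autoflush => 1,   # flush writes to disk immediately
        num_txns  => 2,   # HEAD plus one transaction; fixed at file creation
    );
    $db->{head}  = 0  unless exists $db->{head};
    $db->{tail}  = 0  unless exists $db->{tail};
    $db->{items} = {} unless exists $db->{items};
    return $db;
}

# Run a code ref inside a transaction; roll back and rethrow on error.
sub in_txn {
    my ($db, $work) = @_;
    $db->begin_work;
    my @result;
    my $ok = eval { @result = $work->(); 1 };
    unless ($ok) {
        my $err = $@;
        $db->rollback;
        die $err;
    }
    $db->commit;
    return @result;
}

# Append a record (a plain hash ref) at the tail of the queue.
sub enqueue {
    my ($db, $record) = @_;
    in_txn($db, sub {
        my $seq = $db->{tail};
        $db->{items}{$seq} = { %$record };   # store a copy of the caller's hash
        $db->{tail} = $seq + 1;
        return;
    });
    return;
}

# Remove and return the oldest record, or undef if the queue is empty.
sub dequeue {
    my ($db) = @_;
    my ($record) = in_txn($db, sub {
        my $seq = $db->{head};
        return if $seq >= $db->{tail};        # nothing queued
        my %copy = %{ $db->{items}{$seq} };   # copy out before deleting
        delete $db->{items}{$seq};
        $db->{head} = $seq + 1;
        return \%copy;
    });
    return $record;
}
```

The counters are used instead of a DBM::Deep array with `shift`, on the assumption that deleting one hash key stays cheap as the queue grows. Whether this keeps up with ~500K records per queue per day is something only a test run can tell.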

Have you tried freelancing/outsourcing? Check out Scriptlance - I've worked there since 2003. For more info about Scriptlance and freelancing in general, check out my home node.

In reply to Re^2: Working with large number of small constantly updated records by techcode
in thread Working with large number of small constantly updated records by techcode
