... cron or autosys job
In view of the 'billions of records' you mentioned earlier, in combination with a busy site:
Keep in mind that a table is completely blocked for both reads and writes for the duration of a CLUSTER operation: it takes an ACCESS EXCLUSIVE lock. In practice this often means that CLUSTER is simply not feasible, unless you can afford the downtime.
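For reference, a minimal sketch of what such a run looks like, assuming a hypothetical table big_table clustered on its primary-key index big_table_pkey:

-- Rewrites the table in index order under an ACCESS EXCLUSIVE lock,
-- blocking all reads and writes against the table until it finishes.
CLUSTER big_table USING big_table_pkey;

-- CLUSTER does not refresh planner statistics, so do that afterwards.
ANALYZE big_table;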
I'd say: just forget about primary-key (or index) 'fragmentation' as a problem. It really isn't one, and it should not be an application-level concern.
I compared a gapped-values table against a contiguous-values table: starting from 1 billion rows (=34 GB), deleting 90% of them, and then comparing against a contiguously numbered table of the same size. It turns out that there is indeed no difference in performance:
$ ./testfrag.sh
-- rowcount1 [1_000_000_000]
-- rowcount2 [100000000]
-- gap_size [10]
-- running tests...
fragmented: average (100 searches): 23.754 ms
contiguous: average (100 searches): 26.553 ms
(The difference seen here is coincidental; after another CLUSTER it might just as well go the other way round.)
The test was run on my old desktop, and the timings above are cold-cache. With a warm cache, both drop to ~2.0 ms, again with no difference.
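For the curious, a rough sketch of this kind of setup (the actual testfrag.sh is not reproduced here; the table names, the payload column, and the scaled-down row counts below are made up for illustration):

-- Contiguous-numbered table: 1 million rows, no gaps.
CREATE TABLE contiguous AS
  SELECT g AS id, md5(g::text) AS payload
  FROM generate_series(1, 1000000) g;
ALTER TABLE contiguous ADD PRIMARY KEY (id);

-- Gapped table: 10 million rows, then delete 90% for a gap size of 10.
CREATE TABLE fragmented AS
  SELECT g AS id, md5(g::text) AS payload
  FROM generate_series(1, 10000000) g;
ALTER TABLE fragmented ADD PRIMARY KEY (id);
DELETE FROM fragmented WHERE id % 10 <> 0;  -- keep every 10th row: 1 million remain

VACUUM ANALYZE contiguous;
VACUUM ANALYZE fragmented;

-- Then time random single-row lookups against both, e.g. with \timing in psql:
SELECT * FROM fragmented WHERE id = 4242420;
SELECT * FROM contiguous WHERE id = 424242;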