... cron or autosys job
In view of the 'billions of records' you mentioned earlier, in combination with a busy site:
Keep in mind that a table is completely blocked for both reads and writes for the duration of a CLUSTER operation: it takes an ACCESS EXCLUSIVE lock. In practice this often means that CLUSTER is simply not feasible, unless you can afford the downtime.
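For reference, a minimal sketch of what such a run looks like, assuming a hypothetical table big_table clustered on its primary-key index big_table_pkey:

-- Rewrites the table in index order under an ACCESS EXCLUSIVE lock,
-- blocking all reads and writes against the table until it finishes.
CLUSTER big_table USING big_table_pkey;

-- CLUSTER does not refresh planner statistics, so do that afterwards.
ANALYZE big_table;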
I'd say: just forget about primary-key (or index) 'fragmentation' as a problem. It really isn't one, and it should not be an application-level concern.
I compared a gapped-values table against a contiguous-values table: starting from 1 billion rows (=34 GB), deleting 90% of them, and then comparing against a contiguously numbered table of the same size. It turns out that there is indeed no difference in performance:
$ ./testfrag.sh
-- rowcount1 [1_000_000_000]
-- rowcount2 [100000000]
-- gap_size [10]
-- running tests...
fragmented: average (100 searches): 23.754 ms
contiguous: average (100 searches): 26.553 ms
(The difference seen here is coincidental; after another CLUSTER it might just as well go the other way round.)
The test was run on my old desktop, and the timings above are cold-cache. With a warm cache, both drop to ~2.0 ms, again with no difference.
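For the curious, a rough sketch of this kind of setup (the actual testfrag.sh is not reproduced here; the table names, the payload column, and the scaled-down row counts below are made up for illustration):

-- Contiguous-numbered table: 1 million rows, no gaps.
CREATE TABLE contiguous AS
  SELECT g AS id, md5(g::text) AS payload
  FROM generate_series(1, 1000000) g;
ALTER TABLE contiguous ADD PRIMARY KEY (id);

-- Gapped table: 10 million rows, then delete 90% for a gap size of 10.
CREATE TABLE fragmented AS
  SELECT g AS id, md5(g::text) AS payload
  FROM generate_series(1, 10000000) g;
ALTER TABLE fragmented ADD PRIMARY KEY (id);
DELETE FROM fragmented WHERE id % 10 <> 0;  -- keep every 10th row: 1 million remain

VACUUM ANALYZE contiguous;
VACUUM ANALYZE fragmented;

-- Then time random single-row lookups against both, e.g. with \timing in psql:
SELECT * FROM fragmented WHERE id = 4242420;
SELECT * FROM contiguous WHERE id = 424242;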