in reply to Re^11: Architecture design for full stack development.
in thread Architecture design for full stack development.

Yes, and I am grateful to you for putting me right; I could have gone on thinking that the clustered ordering could be relied upon to persist, rather than requiring scheduled CLUSTER operations (e.g. a cron or autosys job).

One world, one people


Replies are listed 'Best First'.
Re^13: Architecture design for full stack development.
by erix (Prior) on Jun 25, 2017 at 08:43 UTC

    ... cron or autosys job

    In view of the 'billions of records' you mentioned earlier, in combination with a busy site:

    Keep in mind that a table is completely blocked for both read and write during a CLUSTER operation. This often simply means that it is not feasible to use CLUSTER (unless you can suffer the 'downtime').
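    To make the scheduling idea concrete: a sketch of what a periodic off-peak CLUSTER pass might look like as a crontab entry. The database, table, and index names (`mydb`, `events`, `events_pkey`) are made up for illustration, and this only works if the full-table lock during the run is acceptable:

        # Hypothetical crontab entry: re-cluster during a low-traffic window
        # (03:30 on Sundays). CLUSTER rewrites the table in index order and
        # blocks all reads and writes on it until it finishes.
        30 3 * * 0  psql -d mydb -c 'CLUSTER events USING events_pkey;'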

    I'd say, just forget about primary key- (or index-) 'fragmentation' as a problem. It is not, really. It should not be an application level concern.

    I did a comparison between a gapped-values table and a contiguous-values table: starting out with 1 billion rows (=34 GB), deleting 90%, then comparing against a contiguous-numbered table of the same size. It turns out that there is indeed no difference in performance.

        $ ./testfrag.sh
        -- rowcount1 [1_000_000_000]
        -- rowcount2 [100000000]
        -- gap_size [10]
        -- running tests...
        fragmented: average (100 searches): 23.754 ms
        contiguous: average (100 searches): 26.553 ms

    (The difference seen here is coincidence -- after another CLUSTER it might be the other way round)

    The test was done on my old desktop, and above are cold-cache timings (cached, both go down to ~2.0 ms (again no difference)).
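    For anyone who wants to see the shape of such a test without a billion-row Postgres instance, here is a scaled-down sketch in Python using SQLite (not Postgres, and not the original testfrag.sh -- the table names, row counts, and payload are all made up): fill a table, delete 90% of the rows so the surviving ids have gaps, and time indexed point lookups against a same-size table whose ids are contiguous.

    ```python
    import random
    import sqlite3
    import time

    N = 100_000          # rows in the "fragmented" table before deletion
    KEEP = N // 10       # rows that survive the 90% delete

    con = sqlite3.connect(":memory:")
    cur = con.cursor()

    # Fragmented table: ids 0..N-1, then delete every id not divisible by 10,
    # leaving KEEP rows with gaps of 10 between surviving key values.
    cur.execute("CREATE TABLE fragmented (id INTEGER PRIMARY KEY, payload TEXT)")
    cur.executemany("INSERT INTO fragmented VALUES (?, ?)",
                    ((i, "x" * 20) for i in range(N)))
    cur.execute("DELETE FROM fragmented WHERE id % 10 != 0")

    # Contiguous table: the same number of rows, ids 0..KEEP-1 with no gaps.
    cur.execute("CREATE TABLE contiguous (id INTEGER PRIMARY KEY, payload TEXT)")
    cur.executemany("INSERT INTO contiguous VALUES (?, ?)",
                    ((i, "x" * 20) for i in range(KEEP)))
    con.commit()

    def avg_lookup_ms(table, ids, searches=1000):
        """Average wall-clock time for indexed point lookups of existing ids."""
        keys = random.choices(ids, k=searches)
        start = time.perf_counter()
        for k in keys:
            cur.execute(f"SELECT payload FROM {table} WHERE id = ?", (k,))
            cur.fetchone()
        return (time.perf_counter() - start) * 1000 / searches

    frag_ms = avg_lookup_ms("fragmented", [i * 10 for i in range(KEEP)])
    cont_ms = avg_lookup_ms("contiguous", list(range(KEEP)))
    print(f"fragmented: {frag_ms:.4f} ms  contiguous: {cont_ms:.4f} ms")
    ```

    At this tiny scale both lookups are effectively free, which is the point: a gapped primary-key range costs a B-tree descent just like a contiguous one does.
    
    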