in reply to Re^3: 'flock' with multiple users
in thread 'flock' with multiple users
The scenario is that I am building an HA Postgres cluster with three nodes plus DR nodes. Postgres can be configured to replicate, but robust HA needs extra work (Perl seemed the best fit) to check the status of every node in the cluster and, if the current node has the wrong role, either promote it to master or demote it to standby.
For the same reason you don't want Postgres to start at machine bootup: the correct role has to be checked first. Why lock at all? Because otherwise, if the master goes down, all the standbys will try to become master. So the failover program (guess what, I called it failover.pl) has to lock a file before running its cycle; the file lives on a shared volume used for backup, which all the nodes have access to. Then, if it detects no master and promotes its own node, the failover.pl running on any other node is locked out until the count of masters has gone from 0 to 1, so two nodes can never both see no master and promote themselves at the same time. And the lock has to be flock on a file rather than something in the database, because failover.pl only does anything when the Postgres master node is down, so the database cannot be relied on for locking at that point.
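A minimal sketch of that locking cycle, in case it helps. This is not the real failover.pl: the lock file path, the `with_cluster_lock` helper and the placeholder master count are all made up for illustration, and it assumes the filesystem on the shared volume actually honours flock across hosts (not every network filesystem does). It also uses `LOCK_NB`, so a node that finds the lock taken skips the cycle instead of waiting; drop `LOCK_NB` to get the "wait until the other node has finished" behaviour described above.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Spec;

# Hypothetical location: in the real setup this would sit on the shared backup volume.
my $lock_file = $ENV{FAILOVER_LOCK}
    // File::Spec->catfile(File::Spec->tmpdir, 'failover.lock');

# Run $cycle while holding an exclusive lock on $path.
# Returns the cycle's result, or undef if someone else holds the lock.
sub with_cluster_lock {
    my ($path, $cycle) = @_;
    open(my $fh, '>>', $path) or die "Cannot open $path: $!";

    # LOCK_NB: fail immediately instead of blocking when the lock is taken.
    unless (flock($fh, LOCK_EX | LOCK_NB)) {
        close $fh;
        return;
    }

    # Always release the lock, even if the cycle dies.
    my $result = eval { $cycle->() };
    my $err    = $@;
    flock($fh, LOCK_UN);
    close $fh;
    die $err if $err;
    return $result;
}

my $outcome = with_cluster_lock($lock_file, sub {
    my $masters = 0;    # placeholder: the real program would query every node
    return $masters == 0 ? 'promote' : 'no-op';
});

print defined $outcome
    ? "cycle result: $outcome\n"
    : "another node holds the lock, skipping this cycle\n";
```

The check for masters happens inside the lock on purpose: a node that gets the lock second counts the masters only after the first node has finished promoting itself.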
Re^5: 'flock' with multiple users
by Corion (Patriarch) on Jul 30, 2020 at 13:28 UTC