[ClusterLabs] Failure of preferred node in a 2 node cluster

Andrei Borzenkov arvidjaar at gmail.com
Sun Apr 29 08:22:07 EDT 2018


29.04.2018 04:19, Wei Shan wrote:
> Hi,
> 
> I'm using Red Hat Cluster Suite 7 with a watchdog-timer-based fence agent. I
> understand this is a really bad setup, but this is what the end-user wants.
> 
> ATB => auto_tie_breaker
> 
> "When the auto_tie_breaker is used in even-number member clusters, then the
> failure of the partition containing the auto_tie_breaker_node (by default
> the node with lowest ID) will cause other partition to become inquorate and
> it will self-fence. In 2-node clusters with auto_tie_breaker this means
> that failure of node favoured by auto_tie_breaker_node (typically nodeid 1)
> will result in reboot of other node (typically nodeid 2) that detects the
> inquorate state. If this is undesirable then corosync-qdevice can be used
> instead of the auto_tie_breaker to provide additional vote to quorum making
> behaviour closer to odd-number member clusters."
> 

That's not what the upstream corosync manual page says. Corosync itself
won't initiate self-fencing; it just marks the node as being out of
quorum. What happens later depends on higher layers such as pacemaker.
Pacemaker can be configured to commit suicide, but it can also be
configured to ignore quorum completely. I am not familiar with the
details of how RHCS behaves by default.
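
To illustrate (a sketch only, not a statement of the RHCS defaults):
the two behaviours correspond to pacemaker's no-quorum-policy cluster
property, set with one of:

    # Sketch: choosing pacemaker's reaction to losing quorum (pcs
    # syntax as used on RHCS; the crmsh equivalent is
    # "crm configure property no-quorum-policy=...").
    pcs property set no-quorum-policy=suicide   # node kills itself when inquorate
    pcs property set no-quorum-policy=ignore    # keep managing resources regardless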

I just tested on vanilla corosync+pacemaker (openSUSE Tumbleweed), and
nothing happens when I kill the lowest-ID node in a two-node
configuration.
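
For concreteness, a minimal quorum section for the 2-node ATB setup
being discussed might look like this (a sketch based on votequorum(5);
the concrete values are assumptions, not my exact tested config):

    # Sketch of a 2-node corosync.conf quorum section with ATB.
    quorum {
        provider: corosync_votequorum
        expected_votes: 2
        auto_tie_breaker: 1
        auto_tie_breaker_node: lowest   # by default the lowest node ID wins the tie
    }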

If your cluster nodes are configured to commit suicide, what happens
after a reboot depends on at least the wait_for_all corosync setting.
With wait_for_all=1 (the default when two_node is set), and unless you
either a) ignore the quorum state or b) have a fencing resource,
pacemaker on your node will wait indefinitely after the reboot because
the partner is not available.
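
To confirm that this is what is blocking the rebooted node, check the
quorum state (the exact output format varies between corosync
versions):

    # On the rebooted node; "WaitForAll" in the Flags line means
    # corosync is still waiting for all nodes before granting quorum.
    corosync-quorumtool -s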


