[Pacemaker] Stonith: How to avoid deathmatch cluster partitioning
Klaus Darilion
klaus.mailinglists at pernau.at
Wed May 15 12:37:53 UTC 2013
Hi!
I have a two-node cluster: a simple test setup with an
ocf:heartbeat:IPaddr2 resource, using Xen VMs and stonith:external/xen0.
Please see the complete config below.
Basically everything works fine, except in the case of broken corosync
communication between the nodes (simulated by shutting down the network
link used for corosync communication). In this case, both nodes almost
at the same time detect that the other node went offline 'unclean' and
shoot the other node in the head, causing a reboot of both nodes.
I know that the cluster network should be reliable, so that this
scenario does not happen. But is there a way to avoid a deathmatch
when the cluster communication is down for some reason, but the
stonith network still works?
For me the obvious solution would be to use different timeouts for
triggering the head-shot. I tried "start-delay" as suggested in
http://www.gossamer-threads.com/lists/linuxha/pacemaker/80918 but both
nodes still trigger the head-shot immediately.
Do I use the parameter correctly (please see config below)?
Are there other possibilities to solve this problem?
As a workaround, is it possible to use different timeout parameters in
corosync.conf on each node, or should they always be identical?
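The idea of using different timeouts for the head-shot can be illustrated with a toy timing model. This is only a sketch under stated assumptions, not a description of how Pacemaker or external/xen0 behave: it assumes each node waits a fixed delay before issuing its shot, that a shot which has been issued always completes, and that it takes the same number of seconds to take effect on either side. The function name `fence_race` and all numbers are hypothetical.

```python
def fence_race(delays, fence_duration):
    """Toy model of a two-node fencing race; returns the set of nodes shot.

    delays: dict mapping node name -> seconds the node waits after
            detecting the peer as lost before it issues the shot.
    fence_duration: seconds between issuing a shot and the victim
            actually going down (assumed identical for both nodes).
    """
    # Order the two nodes by how long they wait before shooting.
    (fast, t_fast), (slow, t_slow) = sorted(delays.items(), key=lambda kv: kv[1])

    # The faster node always gets its shot off, so the slower node dies.
    shot = {slow}

    # The slower node still shoots back if its own delay expires
    # before the faster node's shot has taken effect.
    if t_slow < t_fast + fence_duration:
        shot.add(fast)
    return shot


# Identical delays: both nodes shoot each other (the deathmatch).
print(fence_race({"pace1": 0, "pace2": 0}, fence_duration=5))
# A 15 s head start for pace2: only pace1 is shot.
print(fence_race({"pace1": 15, "pace2": 0}, fence_duration=5))
```

In this model the deathmatch is avoided only when the difference between the two delays is at least as long as the time a shot needs to take effect, and only if the delay is really applied to the fencing action itself.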
Thanks
Klaus
node pace1
node pace2
primitive ip_service ocf:heartbeat:IPaddr2 \
        params ip="10.10.0.69" nic="eth0" cidr_netmask="24" iflabel="pace" \
        op monitor interval="60s"
primitive st-pace1 stonith:external/xen0 \
        params hostlist="pace1" dom0="xentest1" \
        op start start-delay="15s" interval="0"
primitive st-pace2 stonith:external/xen0 \
        params hostlist="pace2" dom0="xentest2"
location l-st-pace1 st-pace1 -inf: pace1
location l-st-pace2 st-pace2 -inf: pace2
property $id="cib-bootstrap-options" \
        dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="true" \
        no-quorum-policy="ignore"