[ClusterLabs] reproducible split brain

Christopher Harvey cwh at eml.cc
Wed Mar 16 20:59:54 CET 2016


I am able to create a split brain situation in corosync 1.1.13 using
iptables in a 3 node cluster.

I have 3 nodes, vmr-132-3, vmr-132-4, and vmr-132-5

All nodes are operational and form a 3 node cluster, with all nodes
members of the same ring.
vmr-132-3 ---> Online: [ vmr-132-3 vmr-132-4 vmr-132-5 ]
vmr-132-4 ---> Online: [ vmr-132-3 vmr-132-4 vmr-132-5 ]
vmr-132-5 ---> Online: [ vmr-132-3 vmr-132-4 vmr-132-5 ]
So far so good.

Running the following on vmr-132-4 drops all incoming (but not outgoing)
packets from vmr-132-3:
# iptables -I INPUT -s 192.168.132.3 -j DROP
# iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination
DROP       all  --  192.168.132.3        anywhere

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
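
The rule above only filters the INPUT chain, which is why the failure is
one-way. As a rough sketch of the difference between that and a two-way
split, the helper below prints (it does not run) the rules for either
case. The name block_rules and its "in"/"both" argument are made up for
illustration; only the iptables commands themselves are real.

```shell
#!/bin/sh
# Dry-run helper: prints the iptables commands instead of executing them.
# block_rules is a hypothetical name used only for this illustration.
#   $1 = peer address to block
#   $2 = "in"   -> drop packets arriving from the peer (the rule used above)
#        "both" -> also drop packets sent to the peer (two-way split)
block_rules() {
    peer=$1
    mode=$2
    echo "iptables -I INPUT -s $peer -j DROP"
    if [ "$mode" = "both" ]; then
        echo "iptables -I OUTPUT -d $peer -j DROP"
    fi
}

block_rules 192.168.132.3 in
block_rules 192.168.132.3 both
```

To apply the rules rather than print them, the output could be piped to
sh as root on vmr-132-4.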

vmr-132-3 ---> Online: [ vmr-132-3 vmr-132-4 vmr-132-5 ]
vmr-132-4 ---> Online: [ vmr-132-4 vmr-132-5 ]
vmr-132-5 ---> Online: [ vmr-132-4 vmr-132-5 ]

vmr-132-3 thinks everything is normal and continues to provide service,
while vmr-132-4 and vmr-132-5 form a new ring, achieve quorum, and provide
the same service. Splitting the link between 3 and 4 in both directions
isolates vmr-132-3 from the rest of the cluster and everything fails over
normally, so only a unidirectional failure causes problems.
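
The quorum side of this can be checked with a little arithmetic. The
sketch below assumes plain majority quorum with one vote per node (an
assumption, since the quorum configuration is not shown here), and
quorum_needed and partition_state are illustrative names, not cluster
tools. It takes the member counts from the Online lists above.

```shell
#!/bin/sh
# Majority quorum: more than half of the expected votes, floor(n/2) + 1.
# Assumes one vote per node; this is not read from any cluster config.
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}

# $1 = expected votes in the cluster, $2 = members this node can see.
# Prints "quorate" if the visible members reach majority, else "inquorate".
partition_state() {
    expected=$1
    visible=$2
    if [ "$visible" -ge "$(quorum_needed "$expected")" ]; then
        echo quorate
    else
        echo inquorate
    fi
}

partition_state 3 3   # vmr-132-3 still lists all three nodes
partition_state 3 2   # vmr-132-4 and vmr-132-5 list each other
```

Under that assumption both views come out quorate, which matches both
sides running the service at once.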

I don't have stonith enabled right now. I looked over the pacemaker.log
file closely to see whether 4 and 5 would normally have fenced 3, but I
didn't see any fencing or stonith logs.

Would stonith solve this problem, or does this look like a bug?

Thanks,
Chris


