[Pacemaker] split brain - after network recovery - resources can still be migrated

Sat Oct 25 22:35:49 UTC 2014

On Sat, 25 Oct 2014 17:30:07 -0400
Digimer <lists at alteeve.ca> wrote:

> On 25/10/14 05:09 PM, Vladimir wrote:
> > Hi,
> >
> > I'm currently testing a 2-node setup using Ubuntu Trusty.
> >
> > # The scenario:
> >
> > All communication links between the 2 nodes are cut off. This
> > results in a split brain situation and both nodes bring their
> > resources online.
> >
> > When the communication links come back, I see the following behaviour:
> >
> > At the DRBD level the split brain is detected and the device is
> > disconnected on both nodes because of "after-sb-2pri disconnect";
> > it then goes to the StandAlone ConnectionState.
> >
> > I'm wondering why pacemaker does not let the resources fail.
> > It is still possible to migrate resources between the nodes although
> > they're in StandAlone ConnectionState. After a split brain that's
> > not what I want.
> >
> > Is this the expected behaviour? Is it possible to let the resources
> > fail after the network recovery to avoid further data corruption?
> >
> > (At the moment I can't use resource or node level fencing in my
> > setup.)
> >
> > Here is the main part of my config:
> >
> > #> dpkg -l | awk '$2 ~ /^(pacem|coro|drbd|libqb)/{print $2,$3}'
> > corosync 2.3.3-1ubuntu1
> > drbd8-utils 2:8.4.4-1ubuntu1
> > libqb-dev 0.16.0.real-1ubuntu3
> > libqb0 0.16.0.real-1ubuntu3
> > pacemaker 1.1.10+git20130802-1ubuntu2.1
> > pacemaker-cli-utils 1.1.10+git20130802-1ubuntu2.1
> >
> > # pacemaker
> > primitive drbd-mysql ocf:linbit:drbd \
> >     params drbd_resource="mysql" \
> >     op monitor interval="29s" role="Master" \
> >     op monitor interval="30s" role="Slave"
> >
> > ms ms-drbd-mysql drbd-mysql \
> >     meta master-max="1" master-node-max="1" clone-max="2" \
> >     clone-node-max="1" notify="true"
> 
> Split-brains are prevented by using reliable fencing (aka stonith).
> You configure stonith in pacemaker (using IPMI/iRMC/iLO/etc, switched
> PDUs, etc). Then you configure DRBD to use the crm-fence-peer.sh
> fence-handler and you set the fencing policy to
> 'resource-and-stonith;'.
> 
> This way, if all links fail, both nodes block and call a fence. The 
> faster one fences (powers off) the slower, and then it begins
> recovery, assured that the peer is not doing the same.
> 
> Without stonith/fencing, there is no defined behaviour. You will
> get split-brains and that is that. Consider: both nodes lose contact
> with their peer. Without fencing, both must assume the peer is dead
> and thus take over resources.

That split brains can occur in such a setup is clear. But I would
expect pacemaker to stop the drbd resource when the link between the
cluster nodes is reestablished instead of continuing to run it.

> This is why stonith is required in clusters. Even with quorum, you
> can't assume anything about the state of the peer until it is fenced,
> so it would only give you a false sense of security.

Maybe I can use resource-level fencing.
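
For reference, a minimal sketch of what resource-level fencing might
look like in the DRBD 8.4 configuration of the "mysql" resource. The
handler paths are an assumption (the usual location of the scripts
shipped with the DRBD utilities) and should be checked on the actual
system. As far as I understand it, this only helps while the cluster
communication still works and only the replication link is down; with
all links cut, as in the scenario above, it would not prevent the
split brain.

```
resource mysql {
  disk {
    # resource-only: call the fence-peer handler when the peer is lost,
    # without relying on node-level fencing (stonith)
    fencing resource-only;
  }
  handlers {
    # assumed paths; verify where your package installs these scripts
    fence-peer          "/usr/lib/drbd/crm-fence-peer.sh";
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }
  # existing device, disk, net and address settings stay unchanged
}
```

With stonith available, the policy recommended in the quoted reply
would be "fencing resource-and-stonith;" in place of "resource-only".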