[Pacemaker] drbd under pacemaker - always get split brain

Lars Ellenberg lars.ellenberg at linbit.com
Thu Jul 12 11:12:59 EDT 2012


On Thu, Jul 12, 2012 at 04:23:51PM +0200, Nikola Ciprich wrote:
> Hello Lars,
> 
> thanks for your reply..
> 
> > Your "problem" is this:
> > 
> > 	DRBD config:
> > 	       allow-two-primaries,
> > 	       but *NO* fencing policy,
> > 	       and *NO* fencing handler.
> > 
> > 	And, as if that was not bad enough already,
> > 	Pacemaker config:
> > 		no-quorum-policy="ignore" \
> > 		stonith-enabled="false"
> 
> yes, as I've written, it's just a test cluster on virtual machines, therefore no fencing devices.
> 
> however, I don't think that's the whole source of the problem. I tried starting node2 much later
> than node1 (node1 had actually been running for about a day), and ran right into the same situation..
> Pacemaker just doesn't wait long enough for the DRBD resources to connect, and seems to promote them both.
> it really seems like a regression to me, as this always worked well...

It is not.

Pacemaker may just be quicker to promote now,
or in your setup other things may have changed
which also changed the timing behaviour.

But what you are trying to do has always been broken,
and will always be broken.

> even though I've set no-quorum-policy to freeze, the problem returns as soon as the cluster becomes quorate..
> I have all the split-brain and fencing scripts in drbd disabled intentionally, so I had a chance to investigate; otherwise
> one of the nodes always committed suicide. but there should be no reason for a split brain..

Right.

That's why "shooting", as in stonith, is not by itself a good enough fencing
mechanism in a DRBD dual-Primary cluster. You also need to tell the peer
that it is outdated, i.e. that it must not become "Primary" or "Master"
until it has synced up (or at least *starts* to sync up).

You can do that using crm-fence-peer.sh. (It does not actually tell
DRBD that the peer is outdated, but it tells Pacemaker not to promote that
other node, which is even better, if the rest of the system is properly set up.)
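For illustration, the constraint that crm-fence-peer.sh places in the CIB is roughly of this shape (resource and id names here are made up from the log excerpt below; the exact id format depends on your DRBD version):

```
<rsc_location rsc="ms_drbd-sas0" id="drbd-fence-by-handler-ms_drbd-sas0">
  <rule role="Master" score="-INFINITY"
        id="drbd-fence-by-handler-rule-ms_drbd-sas0">
    <expression attribute="#uname" operation="ne" value="vmnci20"
                id="drbd-fence-by-handler-expr-ms_drbd-sas0"/>
  </rule>
</rsc_location>
```

The -INFINITY score for the Master role on any node other than the surviving Primary is what prevents the outdated peer from being promoted until the handler removes the constraint again after resync.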

crm-fence-peer.sh alone is also not good enough in certain situations.
That's why you need both, the drbd "fence-peer" mechanism *and* stonith.
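A minimal sketch of what that looks like in drbd.conf (the resource name r0 is an assumption; check your installation for the actual handler script paths, typically under /usr/lib/drbd/):

```
resource r0 {
    net {
        allow-two-primaries;
    }
    disk {
        # freeze I/O and call the fence-peer handler when the peer is lost
        fencing resource-and-stonith;
    }
    handlers {
        # constrain Pacemaker so the disconnected peer cannot be promoted
        fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
        # remove that constraint again once resync has brought it up to date
        after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
    }
}
```

With "fencing resource-and-stonith", DRBD suspends I/O until the fence-peer handler returns, which is exactly the "both mechanisms" combination described above.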

> 
> cheers!
> 
> nik
> 
> 
> 
> 
> > D'oh.
> > 
> > And then, well,
> > your nodes come up some minute+ after each other,
> > and Pacemaker and DRBD behave exactly as configured:
> > 
> > 
> > Jul 10 06:00:12 vmnci20 crmd: [3569]: info: do_state_transition: All 1 cluster nodes are eligible to run resources.
> > 
> > 
> > Note the *1* ...
> > 
> > So it starts:
> > Jul 10 06:00:12 vmnci20 pengine: [3568]: notice: LogActions: Start   drbd-sas0:0	(vmnci20)
> > 
> > But leaves:
> > Jul 10 06:00:12 vmnci20 pengine: [3568]: notice: LogActions: Leave   drbd-sas0:1	(Stopped)
> > as there is no peer node yet.
> > 
> > 
> > And on the next iteration, we still have only one node:
> > Jul 10 06:00:15 vmnci20 crmd: [3569]: info: do_state_transition: All 1 cluster nodes are eligible to run resources.
> > 
> > So we promote:
> > Jul 10 06:00:15 vmnci20 pengine: [3568]: notice: LogActions: Promote drbd-sas0:0	(Slave -> Master vmnci20)
> > 
> > 
> > And only some minute later, the peer node joins:
> > Jul 10 06:01:33 vmnci20 crmd: [3569]: info: do_state_transition: State transition S_INTEGRATION -> S_FINALIZE_JOIN [ input=I_INTEGRATED cause=C_FSA_INTERNAL origin=check_join_state ]
> > Jul 10 06:01:33 vmnci20 crmd: [3569]: info: do_state_transition: All 2 cluster nodes responded to the join offer.
> > 
> > So now we can start the peer:
> > 
> > Jul 10 06:01:33 vmnci20 pengine: [3568]: notice: LogActions: Leave   drbd-sas0:0	(Master vmnci20)
> > Jul 10 06:01:33 vmnci20 pengine: [3568]: notice: LogActions: Start   drbd-sas0:1	(vmnci21)
> > 
> > 
> > And it even is promoted right away:
> > Jul 10 06:01:36 vmnci20 pengine: [3568]: notice: LogActions: Promote drbd-sas0:1	(Slave -> Master vmnci21)
> > 
> > And within those 3 seconds, DRBD was not able to establish the connection yet.
> > 
> > 
> > You configured DRBD and Pacemaker to produce data divergence.
> > Not surprisingly, that is exactly what you get.
> > 
> > 
> > 
> > Fix your problem.
> > See above; hint: fencing resource-and-stonith,
> > crm-fence-peer.sh + stonith_admin,
> > add stonith, maybe add a third node so you don't need to ignore quorum,
> > ...
> > 
> > And all will be well.


-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.



