[Pacemaker] DRBD 2 node cluster and STONITH configuration help required.

Wed Mar 24 22:41:32 UTC 2010

On Wed, Mar 24, 2010 at 07:59:26PM +0000, Mario Giammarco wrote:
> Andrew Beekhof <andrew at ...> writes:
> 
> > 
> > Have you seen:
> >    http://www.clusterlabs.org/doc/crm_fencing.html
> > I have been led to believe that STONITH
> > > will help prevent split brain situations, but the LINBIT instructions do not
> > > provide any guidance on how to configure STONITH in the pacemaker cluster.
> 
> Probably the 10 million dollar question is: does drbd really need stonith?
> 
> I am interested too....

DRBD per se does not.
Your data may or may not.

The main difference between a replicated and a shared solution here is,

if you do concurrent *uncoordinated* modifications
 * to a shared disk, you scramble your data.
 * to a DRBD, you get "diverging data sets".

So with DRBD, if you really lose all cluster communications,
and do NOT use STONITH, and ignore quorum loss etc.,
you can end up with both sides of the DRBD being Primary,
each consistent in itself, but diverging from the other.
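
To make "consistent in themselves, but diverging" concrete, here is a
toy sketch. It is only an illustration: the dicts and the apply_writes
helper are made up for this example and have nothing to do with how
DRBD tracks blocks internally.

```python
# Toy model only: each replica is a dict of block number -> content.
def apply_writes(replica, writes):
    """Return a copy of the replica with the given block writes applied."""
    updated = dict(replica)
    updated.update(writes)
    return updated


# Both sides start from the same, fully synchronized data.
in_sync = {0: "a", 1: "b", 2: "c"}

# All communication is lost, both sides are Primary,
# and each accepts writes the other never sees.
node_a = apply_writes(in_sync, {1: "written on A"})
node_b = apply_writes(in_sync, {2: "written on B"})

# Blocks on which the two sides now disagree.
diverged = sorted(k for k in in_sync if node_a[k] != node_b[k])
print(diverged)
```

Each side is a perfectly valid data set on its own; there is simply no
way to tell from the block level which of the two is "the right one".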

Once you realise this, you get to the fun part of choosing
which data set you want to keep,
whether you try to "manually merge" them (depending on type of data that
may even be possible, but not on the DRBD level),
or scratch both versions and restore from latest backup anyway.

It may be a plausible assumption that no (relevant) modifications
are done on an isolated system, though, unless you happen to
get client communication going to both systems without re-establishing
communication between those systems.
Which is still entirely possible, of course.

DRBD resource-level fencing can help
in variations of the following scenario:
 all good.
 replication link breaks,
 other cluster comm channels still available.
 [A]
 Primary keeps going for a while
 Primary goes down
 [B] former Secondary takes over WITH STALE DATA.

[A] At this point, resource-level fencing can use the other cluster
communication channels (via the cib, or via dopd) to persistently record
the "Outdated"ness of the then Secondary, so pacemaker would not even
attempt to promote it at [B]; and if asked anyway, it would refuse to be
promoted (without applying brute --force).
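
The scenario above can be walked through with a small toy model. This
is a sketch under assumptions, not DRBD or pacemaker code: the Node
class, the node names and the run() helper are all hypothetical, and
the "outdated" flag merely stands in for the persistent Outdated marker.

```python
class Node:
    def __init__(self, name, role):
        self.name = name
        self.role = role        # "Primary" or "Secondary"
        self.outdated = False   # stands in for the persistent "Outdated" marker
        self.up = True

    def promote(self, force=False):
        """Refuse promotion when marked Outdated, unless forced."""
        if self.outdated and not force:
            return False
        self.role = "Primary"
        return True


def replication_link_breaks(secondary, cluster_comm_ok, resource_fencing):
    """Step [A]: record the peer as Outdated, if some channel still works."""
    if resource_fencing and cluster_comm_ok:
        secondary.outdated = True


def run(resource_fencing, cluster_comm_ok=True):
    """Return True if the former Secondary gets promoted at step [B]."""
    primary = Node("alpha", "Primary")
    secondary = Node("bravo", "Secondary")
    replication_link_breaks(secondary, cluster_comm_ok, resource_fencing)
    primary.up = False           # Primary keeps going for a while, then goes down
    return secondary.promote()   # step [B]


print(run(resource_fencing=True))                         # promotion refused
print(run(resource_fencing=False))                        # takeover with stale data
print(run(resource_fencing=True, cluster_comm_ok=False))  # marker never recorded
```

The third call is the interesting one: if no cluster communication is
left at [A], nothing gets recorded, and the model promotes the stale
side just as if no fencing had been configured.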

Variations of that scenario include Secondary crash instead of
replication link loss, and some are more difficult to explain.
Some may also require setting "fencing resource-and-stonith" in
drbd.conf, even though no stonith is actually applied, just for the
side-effects of that setting.

resource-level fencing (alone) is NOT sufficient, if all (remaining)
cluster communication is lost at (virtually) the same time as the
replication link.

So if you need to protect yourself against that scenario,
you need to (also) configure STONITH.

Stonith alone does not help, either.

Above scenario again,
but instead of "Primary goes down", think
"remaining cluster comm breaks".
shoot out,
former Secondary wins.

But just because you can shoot someone
does not mean you have the bi^Wbetter data.
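
As a toy illustration of that point (a sketch only: the shoot_out
function, the node labels and the write counters are invented for this
example, and the winner is passed in because who wins the race is
arbitrary):

```python
def shoot_out(primary_writes_after_break, winner):
    """Model a STONITH race after replication AND cluster comm are lost.

    Returns (survivor, survivor_has_latest_data).
    """
    # Both sides were in sync at "generation" 100 (arbitrary number);
    # only the Primary kept writing after the replication link broke.
    data = {
        "primary": 100 + primary_writes_after_break,
        "secondary": 100,
    }
    survivor = winner  # whoever shoots first stays up
    survivor_is_current = data[survivor] == max(data.values())
    return survivor, survivor_is_current


print(shoot_out(primary_writes_after_break=5, winner="secondary"))
print(shoot_out(primary_writes_after_break=5, winner="primary"))
```

In the first call the former Secondary wins the shoot-out and the
cluster carries on with it, although the writes made on the Primary
after the link broke exist only on the node that was just shot.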

Sooo.
What do we do about those dollars now?

 ;-)

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.