[Pacemaker] Two node cluster and no hardware device for stonith.

Wed Feb 11 05:30:57 EST 2015

On Wed, Feb 11, 2015 at 09:10:45AM +0300, Andrei Borzenkov wrote:
> В Tue, 10 Feb 2015 15:58:57 +0100
> Dejan Muhamedagic <dejanmm at fastmail.fm> пишет:
> 
> > On Mon, Feb 09, 2015 at 04:41:19PM +0100, Lars Ellenberg wrote:
> > > On Fri, Feb 06, 2015 at 04:15:44PM +0100, Dejan Muhamedagic wrote:
> > > > Hi,
> > > > 
> > > > On Thu, Feb 05, 2015 at 09:18:50AM +0100, Digimer wrote:
> > > > > That is the problem that makes geo-clustering very hard to nearly
> > > > > impossible. You can look at the Booth option for pacemaker, but that
> > > > > requires two (or more) full clusters, plus an arbitrator 3rd
> > > > 
> > > > A full cluster can consist of one node only. Hence, it is
> > > > possible to have a kind of stretch two-node [multi-site] cluster
> > > > based on tickets and managed by booth.
> > > 
> > > In theory.
> > > 
> > > In practice, we rely on "proper behaviour" of "the other site",
> > > in case a ticket is revoked, or cannot be renewed.
> > > 
> > > Relying on a single node for "proper behaviour" does not inspire
> > > as much confidence as relying on a multi-node HA-cluster at each site,
> > > which we can expect to ensure internal fencing.
> > > 
> > > With reliable hardware watchdogs, it still should be ok to do
> > > "stretched two node HA clusters" in a reliable way.
> > > 
> > > Be generous with timeouts.
> > 
> > As always.
> > 
> > > And document which failure modes you expect to handle,
> > > and how to deal with the worst-case scenarios if you end up with some
> > > failure case that you are not equipped to handle properly.
> > > 
> > > There are deployments which favor
> > > "rather online with _potential_ split brain" over
> > > "rather offline just in case".
> > 
> > There's an arbitrator which should help in case of split brain.
> > 
> 
> You can never really differentiate between site down and site cut off
> due to (network) infrastructure outage. Arbitrator can mitigate split
> brain only to the extent you trust your network. You still have to take
> decision what you value more - data availability or data consistency.

Right, that's why I mentioned ticket loss policy. If booth drops
the ticket, pacemaker would fence the node (if loss-policy=fence).
Booth guarantees that no two sites will hold the ticket at the
same time. Of course, you have to trust booth to function
properly, but I guess that's a different story.

Thanks,

Dejan

> Long distance clusters are really for disaster recovery. It is
> convenient to have a single button that starts up all resources in
> controlled manner, but someone really need to decide to push this
> button.
> 
> > > Document this, print it out on paper,
> > > 
> > >    "I am aware that this may lead to lost transactions,
> > >    data divergence, data corruption, or data loss.
> > >    I am personally willing to take the blame,
> > >    and live with the consequences."
> > > 
> > > Have some "boss" sign that ^^^
> > > in the real world using a real pen.
> > 
> > Well, of course running such a "stretch" cluster would be
> > rather different from a "normal" one.
> > 
> > The essential thing is that there's no fencing, unless configured
> > as a dead-man switch for the ticket. Given that booth has a
> > "sanity" program hook, maybe that could be utilized to verify if
> > this side of the cluster is healthy enough.
> > 
> > Thanks,
> > 
> > Dejan
> > 
> > > 	Lars
> > > 
> > > -- 
> > > : Lars Ellenberg
> > > : http://www.LINBIT.com | Your Way to High Availability
> > > : DRBD, Linux-HA  and  Pacemaker support and consulting
> > > 
> > > DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
> > > 
> > > _______________________________________________
> > > Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> > > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> > > 
> > > Project Home: http://www.clusterlabs.org
> > > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > > Bugs: http://bugs.clusterlabs.org
> > 
> > _______________________________________________
> > Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> > 
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org
>