[Pacemaker] Re: fencing with multiple node cluster

Dejan Muhamedagic dejanmm at fastmail.fm
Wed Oct 29 06:37:48 EDT 2014


Hi,

On Tue, Oct 28, 2014 at 05:32:09PM +0100, philipp.achmueller at arz.at wrote:
> hi,
> 
> 
> 
> 
> From:    Dejan Muhamedagic <dejanmm at fastmail.fm>
> To:      The Pacemaker cluster resource manager 
> <pacemaker at oss.clusterlabs.org>
> Date:    28.10.2014 16:45
> Subject: Re: [Pacemaker] fencing with multiple node cluster
> 
> >
> >
> >Hi,
> >
> >On Tue, Oct 28, 2014 at 09:51:02AM -0400, Digimer wrote:
> >>> On 28/10/14 05:59 AM, philipp.achmueller at arz.at wrote:
> >>> hi,
> >>>
> >>> any recommendation/documentation for a reliable fencing implementation
> >>> on a multi-node cluster (4 or 6 nodes across 2 sites)? I am thinking of
> >>> implementing multiple node-fencing devices for each host to stonith the
> >>> remaining nodes on the other site.
> >>>
> >>> thank you!
> >>> Philipp
> >>
> >> Multi-site clustering is very hard to do well because of fencing 
> >> issues. How do you distinguish a site failure from severed links?
> >
> >Indeed. There's a booth server managing the tickets in
> >Pacemaker; booth uses arbitrators to resolve ties. The booth
> >source is available on github.com and packaged for several
> >distributions at the OBS
> >(http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/).
> >It's also supported in the newly released SLE12.
> >
> >Thanks,
> >
> >Dejan
> >
> hi,
> 
> @Digimer: thank you for the explanation, but manual failover between sites 
> isn't what I'm looking for.
> 
> @Dejan: Yes, I already tried a cluster (SLES11 SP3) with a booth setup. I 
> used the documentation from sleha 11 SP3. 

Good, so you're already familiar with the concept :)

> but I'm afraid it is unclear to me how "fencing" with booth exactly works 
> in case of some failures (loss-policy=fence). The documentation says 
> something like: ...to speed up the recovery process nodes get fenced... Do 
> I need classic node fencing (IPMI) when I configure a booth setup? Do you 
> have some more information about that?

It's just "normal" fencing, i.e. the one that happens _within_ a
site. The "loss-policy" refers to the event of a node losing the
ticket. If that happens and the loss-policy is set to "fence",
then the node is going to be fenced. The stonith operation is
executed by another node in the same site.
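
For illustration, a ticket dependency with loss-policy=fence in crm
shell syntax could look roughly like this (the resource and ticket
names are made up, adjust them to your configuration):

  # hypothetical example: group g-db may only run while this site
  # holds ticketA; if ticketA is lost or revoked, the node running
  # g-db is fenced instead of stopping the resources gracefully
  rsc_ticket g-db-req-ticketA ticketA: g-db loss-policy=fence

Independently of that you still need working stonith devices inside
each site, because that is what actually carries out the fence.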

> For a correct setup, the arbitrator needs an adequate 3rd location. Site A 
> and site B need separate connections to site C, otherwise some scenarios 
> will fail.
> Is there any possibility to get this running with 2 sites?

A third member in the party is absolutely necessary. In 2-node
clusters the missing tie-breaker is effectively replaced by
fencing. Since fencing is rather difficult to execute from a
faraway location, we need an arbitrator instead. Note that a
rather small machine is enough for the arbitrator.
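
For reference, a minimal booth configuration with two sites and one
arbitrator might look roughly like this (the IP addresses and the
ticket name are placeholders, and the exact syntax differs a bit
between the booth shipped with SLES11 SP3 and the newer one in
SLE12):

  # /etc/booth/booth.conf (sketch, newer booth syntax)
  transport  = UDP
  port       = 9929
  site       = 192.168.10.10    # cluster at site A
  site       = 192.168.20.10    # cluster at site B
  arbitrator = 192.168.30.10    # small machine at the 3rd location
  ticket = "ticketA"
      expire = 600

The same file is used on both sites and on the arbitrator; the
arbitrator only runs boothd, no Pacemaker.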

One (crazy) idea for such remote fencing could be to equip hosts
with GSM devices (or whatever the current technology for mobile
telephony is) and then send the fencing commands via SMS.
I suppose that such a fancy thing doesn't exist, so it would
require some programming.

Thanks,

Dejan


> thank you!
> 
> 
> >> Given that a failed fence action can not be assumed to be a success, 
> >> the only safe option is to block until a human intervenes. This makes 
> >> your cluster as reliable as your WAN between the sites, which is to 
> >> say, not very reliable. In any case, the destruction of a site will 
> >> require manual failover, which can be complicated if insufficient 
> >> nodes remain to form quorum.
> >>
> >> Generally, I'd recommend two different clusters, one per site, with 
> >> manual/service-level failover in the case of a disaster.
> >>
> >> In any case, a good fencing setup should have two fence methods. 
> >> Personally, I always use IPMI as the primary fence method (routed 
> >> through one switch) and a pair of switched PDUs as backup (via a 
> >> backup switch). This way, when IPMI is available, a confirmed fence 
> >> is 100% certain to be good. However, if the node is totally 
> >> disabled/destroyed, IPMI will be lost and the cluster will switch to 
> >> the switched PDUs, cutting the power outlets feeding the node.
> >>
> >> I've got a block diagram of how I do this:
> >>
> >> https://alteeve.ca/w/AN!Cluster_Tutorial_2#A_Map.21
> >>
> >> It's trivial to scale the idea up to multiple node clusters.
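
For reference, such a two-level setup (IPMI first, switched PDU as
fallback) can be expressed in crm shell roughly like this; the agent
names and parameters below are only placeholders for whatever matches
your hardware:

  # per-node stonith devices (parameters are illustrative only)
  primitive st-ipmi-node1 stonith:external/ipmi \
      params hostname=node1 ipaddr=10.0.0.11 userid=admin \
             passwd=secret interface=lanplus
  primitive st-pdu-node1 stonith:external/rackpdu \
      params hostname=node1 pduip=10.0.0.21 community=private
  # try IPMI first; only if that fails, fall back to the PDU
  fencing_topology node1: st-ipmi-node1 st-pdu-node1
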
> >>
> >> Cheers
> >>
> >> -- 
> >> Digimer
> >> Papers and Projects: https://alteeve.ca/w/
> >> What if the cure for cancer is trapped in the mind of a person without 
> >> access to education?
> >>




