[Pacemaker] fencing with multiple node cluster

Tue Oct 28 13:51:02 UTC 2014

On 28/10/14 05:59 AM, philipp.achmueller at arz.at wrote:
> hi,
>
> any recommendation/documentation for a reliable fencing implementation
> on a multi-node cluster (4 or 6 nodes on 2 site).
> i think of implementing multiple node-fencing devices for each host to
> stonith remaining nodes on other site?
>
> thank you!
> Philipp

Multi-site clustering is very hard to do well because of fencing issues. 
How do you distinguish a site failure from severed links? Because a 
failed fence action cannot be assumed to have succeeded, the only 
safe option is to block until a human intervenes. This makes your 
cluster only as reliable as the WAN between the sites, which is to say, not 
very reliable. In any case, the destruction of a site will require 
manual failover, which can be complicated if insufficient nodes remain 
to form quorum.
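
To make the quorum problem concrete, here is a minimal Python sketch. It 
assumes one vote per node and a strict-majority rule (more than half of 
all votes); the function names are made up for illustration and are not 
part of Pacemaker or Corosync.

```python
def quorum_threshold(total_nodes: int) -> int:
    """Votes needed for a strict majority, assuming one vote per node."""
    return total_nodes // 2 + 1


def survivors_have_quorum(total_nodes: int, surviving_nodes: int) -> bool:
    """True if the surviving nodes alone hold a strict majority."""
    return surviving_nodes >= quorum_threshold(total_nodes)


# Evenly split two-site clusters: losing one site leaves half the nodes.
for total in (4, 6):
    remaining = total // 2
    print(
        f"{total} nodes, {remaining} survive: "
        f"need {quorum_threshold(total)}, "
        f"quorate={survivors_have_quorum(total, remaining)}"
    )
```

Under these assumptions, an even split across two sites never leaves the 
surviving site with a majority, so someone has to step in by hand.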

Generally, I'd recommend two separate clusters, one per site, with 
manual/service-level failover in the case of a disaster.

In any case, a good fencing setup should have two fence methods. 
Personally, I always use IPMI as a primary fence method (routed through 
one switch) and a pair of switched PDUs as backup (via a backup switch). 
This way, when IPMI is available, a confirmed fence is 100% certain to 
be good. However, if the node is totally disabled/destroyed, IPMI will 
be lost and the cluster will switch to the switched PDUs, cutting the 
power outlets feeding the node.
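
The fallback order can be sketched in a few lines of Python. This is only 
a model of the logic, not how Pacemaker implements it: `fence_node` and 
the three stand-in device functions are hypothetical names, and each 
device is assumed to return True only when the power-off was confirmed.

```python
from typing import Callable, Sequence

# A fence device is modelled as a callable that returns True only when
# the power-off was positively confirmed (an assumption of this sketch).
FenceDevice = Callable[[], bool]


def fence_node(methods: Sequence[Sequence[FenceDevice]]) -> bool:
    """Try each fence method in order.

    A method counts as successful only if every device in it confirms.
    Returns False if no method confirmed, in which case the caller must
    block rather than assume the node is off.
    """
    for devices in methods:
        # all() stops at the first device that fails to confirm.
        if all(device() for device in devices):
            return True
    return False


# Hypothetical stand-ins simulating a destroyed node: the IPMI interface
# is unreachable, but both switched PDUs can still cut their outlets.
def ipmi_off() -> bool:
    return False


def pdu1_outlet_off() -> bool:
    return True


def pdu2_outlet_off() -> bool:
    return True


fenced = fence_node([
    [ipmi_off],                          # primary method: IPMI
    [pdu1_outlet_off, pdu2_outlet_off],  # backup method: both PDUs
])
print("fenced" if fenced else "blocked, waiting for a human")
```

Note that the backup method needs both PDUs to confirm. With redundant 
power supplies, cutting only one outlet would leave the node running.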

I've got a block diagram of how I do this:

https://alteeve.ca/w/AN!Cluster_Tutorial_2#A_Map.21

It's trivial to scale the idea up to clusters with more nodes.

Cheers

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without 
access to education?