[Pacemaker] fencing with multiple node cluster
Digimer
lists at alteeve.ca
Tue Oct 28 13:51:02 UTC 2014
On 28/10/14 05:59 AM, philipp.achmueller at arz.at wrote:
> hi,
>
> any recommendation/documentation for a reliable fencing implementation
> on a multi-node cluster (4 or 6 nodes on 2 site).
> i think of implementing multiple node-fencing devices for each host to
> stonith remaining nodes on other site?
>
> thank you!
> Philipp
Multi-site clustering is very hard to do well because of fencing issues.
How do you distinguish a site failure from severed links? Since a
failed fence action cannot be assumed to have succeeded, the only
safe option is to block until a human intervenes. This makes your
cluster only as reliable as the WAN between the sites, which is to say,
not very reliable. In any case, the destruction of a site will require
manual failover, which can be complicated if too few nodes remain
to form quorum.
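To put numbers on the quorum problem, here is a minimal sketch in Python. It assumes one vote per node, simple-majority quorum, and nodes split evenly between the two sites; the even split and the function names are assumptions for illustration, not something stated in the question.

```python
def quorum_threshold(total_nodes: int) -> int:
    """Votes needed for a simple majority, assuming one vote per node."""
    return total_nodes // 2 + 1


def has_quorum(surviving_nodes: int, total_nodes: int) -> bool:
    """True if the surviving nodes alone can form a majority."""
    return surviving_nodes >= quorum_threshold(total_nodes)


for total in (4, 6):
    # Assumption: nodes are split evenly across two sites and one site is lost.
    survivors = total // 2
    print(
        f"{total} nodes, {survivors} survive, "
        f"need {quorum_threshold(total)}, quorate: {has_quorum(survivors, total)}"
    )
```

With an even split, the surviving site never holds a majority on its own (2 of 4, or 3 of 6), so it cannot carry on without manual intervention.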
Generally, I'd recommend two separate clusters, one per site, with
manual/service-level failover in the case of a disaster.
In any case, a good fencing setup should have two fence methods.
Personally, I always use IPMI as the primary fence method (routed through
one switch) and a pair of switched PDUs as the backup (via a backup switch).
This way, when IPMI is available, a confirmed fence is 100% certain to
be good. However, if the node is totally disabled/destroyed, IPMI will
be lost and the cluster will fall back to the switched PDUs, cutting the
power outlets feeding the node.
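The ordering described above can be sketched roughly as follows. This is illustrative Python only, not Pacemaker's implementation; `fence_node` and the stub device functions are hypothetical names. It assumes a level counts as successful only when every device in it confirms, so both PDU outlets must be cut before the backup method is trusted.

```python
from typing import Callable, Sequence

# A fence action returns True only when the device confirms success.
FenceAction = Callable[[str], bool]


def fence_node(node: str, levels: Sequence[Sequence[FenceAction]]) -> bool:
    """Try each fence level in order; a level succeeds only if all its devices confirm."""
    for devices in levels:
        if all(action(node) for action in devices):
            return True
    # No level confirmed: the node's state is unknown, so the caller must block.
    return False


# Hypothetical stubs standing in for real fence agents.
def ipmi_off(node: str) -> bool:
    return False  # simulate IPMI being lost along with the node


def pdu1_off(node: str) -> bool:
    return True


def pdu2_off(node: str) -> bool:
    return True


levels = [[ipmi_off], [pdu1_off, pdu2_off]]
if fence_node("node2", levels):
    print("fence confirmed; safe to recover services")
else:
    print("fence failed; block until a human intervenes")
```

The key design point is the final `return False`: an unconfirmed fence is never treated as a success, which is why the caller blocks rather than proceeding with recovery.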
I've got a block diagram of how I do this:
https://alteeve.ca/w/AN!Cluster_Tutorial_2#A_Map.21
It's trivial to scale the idea up to clusters with more nodes.
Cheers
--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?