[Pacemaker] Stonith: How to avoid deathmatch cluster partitioning
Lars Marowsky-Bree
lmb at suse.com
Thu May 16 09:01:18 UTC 2013
On 2013-05-15T22:55:43, Andreas Kurz <andreas at hastexo.com> wrote:
> start-delay is an option of the monitor operation ... in fact means
> "don't trust that start was successful, wait for the initial monitor
> some more time"
It can be used on start here though to avoid exactly this situation; and
it works fine for that, effectively being equivalent to the "delay"
option on stonith (since the start always precedes the fence).
> The problem is, this would only make sense for one single stonith
> resource that can fence more nodes. In case of a split-brain that would
> delay the start on that node where the stonith resource was not running
> before and gives that node a "penalty".
Sure. In a split-brain scenario, one side will receive a penalty; that's
the whole point of this exercise. This applies in particular to the
external/sbd agent.
The same effect can be had by grouping all fencing resources to always
run on one node, for example if you don't have access to RHT fence
agents.
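
To illustrate why giving one side a penalty breaks the tie, here is a
minimal sketch in Python. It is a toy model, not Pacemaker or stonith
code: the function name fence_race, the node names, and the 15 second
delay are all made up for the example.

```python
def fence_race(delays):
    """Return the set of nodes still alive after a fencing race.

    `delays` maps a (hypothetical) node name to the number of seconds that
    node waits before shooting its peers. A node that has already been shot
    never gets to fire.
    """
    alive = set(delays)
    # Walk through the shots in time order; shots fired at the same
    # instant all land, which is the death-match case.
    for t in sorted(set(delays.values())):
        shooters = [n for n in delays if delays[n] == t and n in alive]
        victims = {peer for n in shooters for peer in delays if peer != n}
        alive -= victims
    return alive


# No penalty: both nodes fire at once and both go down.
print(fence_race({"node1": 0, "node2": 0}))   # set()
# node2 is penalised by 15s: node1 shoots first and survives.
print(fence_race({"node1": 0, "node2": 15}))  # {'node1'}
```

The only point of the model is that the delayed side never gets to
fire, so exactly one node survives the split.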
external/sbd also has code to avoid a death-match cycle in case of
persistent split-brain scenarios now; after a reboot, the node that was
fenced will not join unless the fence is cleared first.
(The RHT world calls that "unfence", I believe.)
That should be a win for the fence_sbd that I hope to get around to
sometime in the next few months, too ;-)
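
The "do not rejoin until the fence is cleared" rule mentioned above can
be sketched the same way. Again this is only a toy model in Python; the
function count_fences and the fenced_flag marker are hypothetical and
say nothing about how external/sbd actually stores or checks its state.

```python
def count_fences(reboots, require_clear):
    """Count fence events during a persistent split-brain.

    `reboots` is how many times the fenced node would come back up.
    `require_clear` models the rule that a fenced node stays out of the
    cluster until someone clears the fence by hand.
    """
    fences = 0
    fenced_flag = False  # hypothetical marker left behind by a fence
    for _ in range(reboots):
        if require_clear and fenced_flag:
            break        # node refuses to join; the cycle stops here
        fences += 1      # node joins, the split persists, someone gets shot
        fenced_flag = True
    return fences


print(count_fences(5, require_clear=False))  # 5
print(count_fences(5, require_clear=True))   # 1
```

Without the check, every reboot leads to another fence; with it, the
cycle ends after the first one until the flag is cleared.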
> In your example with two stonith resources running all the time,
> Digimer's suggestion is a good idea: use one of the redhat fencing
> agents, most of them have some sort of "stonith-delay" parameter that
> you can use with one instance.
It'd make sense to have logic for this embedded at a higher level,
somehow; the problem is all too common.
Of course, it is most relevant in scenarios where "split brain" is
significantly more likely than "node down". That is true for most test
scenarios (admins love yanking cables), but in practice, it's mostly the
node that is truly down.
Regards,
Lars
--
Architect Storage/HA
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 21284 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde