[Pacemaker] new doc about stonith/fencing

Peter Kruse pk at q-leap.de
Mon May 4 05:37:42 EDT 2009


Hi Dejan,

Dejan Muhamedagic wrote:
> As usual, constructive criticism/suggestions/etc are welcome.

Thanks for sharing.
Allow me to bring up a topic that is important from my point of view.
You have written:

> The lights-out devices (IBM RSA, HP iLO, Dell DRAC) are becoming increasingly popular
> and in future they may even become standard equipment of off-the-shelf computers.
> They are, however, inferior to UPS devices, because they share a power supply with their
> host (a cluster node). If a node stays without power, the device supposed to control it
> would be just as useless. Even though this is obvious to us, the cluster manager is not
> in the know and will try to fence the node in vain. This will continue forever because all
> other resource operations would wait for the fencing/stonith operation to succeed.

PDUs have the same problem, since they share a power supply with the
host as well.  Is there any intention to deal with this issue?  I'm
thinking of a powerfail algorithm:

If the PDU becomes unavailable and shortly afterwards the host becomes
unavailable as well, then assume the host is down and fenced successfully.

This assumption would hold if the PDU (and with it the host) loses power.
At the moment it looks as if stonith without such an algorithm is
a SPoF by design, because after a single failure (power loss), the
cluster is not able to bring up the resources again.
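
The powerfail rule could be sketched roughly as follows. This is only an
illustration in Python under stated assumptions: the names Observation,
assume_fenced and POWERFAIL_WINDOW_SECONDS are hypothetical, the 30 second
window is an arbitrary example value, and none of this is part of any
existing stonith plugin or of Pacemaker itself.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical tunable: how soon after the PDU the host must vanish
# for both failures to be treated as one power loss.
POWERFAIL_WINDOW_SECONDS = 30.0


@dataclass
class Observation:
    """Timestamps (in seconds) at which each device became unreachable.

    None means the device is still reachable.
    """
    pdu_lost_at: Optional[float]
    host_lost_at: Optional[float]


def assume_fenced(obs: Observation,
                  window: float = POWERFAIL_WINDOW_SECONDS) -> bool:
    """Return True if the host may be assumed to be powered off.

    The rule: the PDU became unavailable first, and the host became
    unavailable no later than `window` seconds afterwards.
    """
    if obs.pdu_lost_at is None or obs.host_lost_at is None:
        # At least one of the two is still reachable: no power loss assumed.
        return False
    delay = obs.host_lost_at - obs.pdu_lost_at
    return 0.0 <= delay <= window
```

For example, with the default window, a PDU lost at t=100 and a host lost
at t=105 would count as fenced, while a host lost at t=200 would not.  The
sketch only encodes the timing rule; it cannot tell a power loss apart from
a network outage that hides both devices.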

Looking forward to your comments,

   Peter



