[ClusterLabs] How to cancel a fencing request?
Jehan-Guillaume de Rorthais
jgdr at dalibo.com
Tue Apr 10 02:48:55 EDT 2018
On Mon, 09 Apr 2018 17:59:26 -0500
Ken Gaillot <kgaillot at redhat.com> wrote:
> On Tue, 2018-04-10 at 00:02 +0200, Jehan-Guillaume de Rorthais wrote:
> > On Tue, 03 Apr 2018 17:35:43 -0500
> > Ken Gaillot <kgaillot at redhat.com> wrote:
> >
> > > On Tue, 2018-04-03 at 21:46 +0200, Klaus Wenninger wrote:
> > > > On 04/03/2018 05:43 PM, Ken Gaillot wrote:
> > > > > On Tue, 2018-04-03 at 07:36 +0200, Klaus Wenninger wrote:
> > > > > > On 04/02/2018 04:02 PM, Ken Gaillot wrote:
> > > > > > > On Mon, 2018-04-02 at 10:54 +0200, Jehan-Guillaume de Rorthais wrote:
> >
> > [...]
> > > > > >
> > > > > > -inf constraints like that should effectively prevent
> > > > > > stonith-actions from being executed on those nodes.
> > > > >
> > > > > It shouldn't ...
> > > > >
> > > > > Pacemaker respects target-role=Started/Stopped for controlling
> > > > > execution of fence devices, but location (or even whether the
> > > > > device is "running" at all) only affects monitors, not execution.
> > > > >
> > > > > > Though there are a few issues with location constraints
> > > > > > and stonith-devices.
> > > > > >
> > > > > > When stonithd brings up the devices from the cib it
> > > > > > runs the parts of pengine that fully evaluate these
> > > > > > constraints and it would disable the stonith-device
> > > > > > if the resource is unrunnable on that node.
> > > > >
> > > > > That should be true only for target-role, not everything that
> > > > > affects runnability.
> > > >
> > > > cib_device_update bails out via a removal of the device if
> > > > - role == stopped
> > > > - node not in allowed_nodes-list of stonith-resource
> > > > - weight is negative
> > > >
> > > > Wouldn't that include a -inf rule for a node?
> > >
> > > Well, I'll be ... I thought I understood what was going on there.
> > > :-)
> > > You're right.
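
For illustration, the three removal conditions Klaus lists for
cib_device_update can be modelled as below. This is only a sketch of the
logic as described in this thread, not Pacemaker's actual code: the
FenceDevice class, the device_enabled_on function and the node and device
names are all hypothetical, and the allowed-nodes table is assumed to hold
the location scores after constraint evaluation.

```python
from dataclasses import dataclass, field

# Pacemaker treats a score of 1000000 as INFINITY.
INFINITY = 1_000_000


@dataclass
class FenceDevice:
    """Hypothetical model of a stonith resource as seen on one cluster."""
    name: str
    target_role: str = "Started"
    # Assumed shape: node name -> location score after constraints.
    allowed_nodes: dict = field(default_factory=dict)


def device_enabled_on(device: FenceDevice, node: str) -> bool:
    """Return False when any of the three conditions from the thread holds."""
    # 1. role == stopped
    if device.target_role.lower() == "stopped":
        return False
    # 2. node not in the allowed-nodes list of the stonith resource
    if node not in device.allowed_nodes:
        return False
    # 3. weight is negative (a -inf location rule ends up here)
    if device.allowed_nodes[node] < 0:
        return False
    return True


# One device per target, banned from its own target with -inf.
banned = FenceDevice("fence_node1",
                     allowed_nodes={"node1": -INFINITY, "node2": 100})
# Same device, but with a lower positive score on its target instead.
preferred_elsewhere = FenceDevice("fence_node1",
                                  allowed_nodes={"node1": 10, "node2": 100})

print(device_enabled_on(banned, "node1"))
print(device_enabled_on(preferred_elsewhere, "node1"))
```

Under these assumptions the -inf ban disables the device on node1, while the
lower positive score keeps it usable there as a last resort.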
> > >
> > > I've frequently seen it recommended to ban fence devices from their
> > > target when using one device per target. Perhaps it would be better
> > > to give a lower (but positive) score on the target compared to the
> > > other node(s), so it can be used when no other nodes are available.
> >
> > Wait, you mean a fencing resource can be triggered from its own
> > target? What happens then? Node suicide and all the cluster nodes are
> > shut down?
> >
> > Thanks,
>
> A node can fence itself, though it will be the cluster's last resort
> when no other node can. It doesn't necessarily imply all other nodes
> are shut down ...
Indeed, sorry I wasn't clear enough: I was talking about a fencing race
situation.
> there may be other nodes up, but they are not allowed to
> execute the relevant fence device for whatever reason.
In such a situation, how can the other nodes be sure the node fenced itself
when they get no confirmation?
> But of course there might be no other nodes up, in which case, yes, the
> cluster dies (the idea being that the node is known to be malfunctioning, so
> stop it from possibly corrupting data).
This makes sense to me.
Thanks,