[Pacemaker] Question on ILO stonith resource config and restarting
Dejan Muhamedagic
dejanmm at fastmail.fm
Wed Nov 5 15:46:47 UTC 2008
On Wed, Nov 05, 2008 at 08:56:22AM -0500, Aaron Bush wrote:
> >
> > BTW, did you try to test your ilo device with the stonith
> > program. Use -d to get debugging output.
> >
>
> I did not try it via the stonithd -d.

No, I meant stonith -d ... That's an external program, nothing to
do with your cluster configuration.

> I was just tinkering with the
> actual resource Python script (after setting the appropriate environment
> variables).
>
> When the LAN connection is up and available the script works well. When
> the connection is down it also works well and a timeout is thrown.
>
>
> > I'd prefer to have the upper layer (stonithd) timeout. Why do
> > you think that this would help?
>
> That is fine. I was just taking a stab at it and hoping to start a
> discussion on whether the timeout should exist in the resource script.
> Since the upper layer is catching it, that is good and the safe place
> for a catch-all, preventing underlying script errors/bugs from hanging
> the cluster.

The problem with handling timeouts within resources is that they
don't know the configured timeout values. Hence, it's better that
they just keep trying until the user-configured timeout occurs.
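
As a rough illustration of why the timeout belongs in the upper layer,
here is a minimal Python sketch. It is not stonithd's actual code: the
helper name run_with_timeout and the commands passed to it are made up
for the example. The caller owns the configured timeout and kills the
child when it expires, so the resource script never needs to know the
value and can simply keep retrying.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return its exit code, or None if it ran past timeout_s.

    Hypothetical helper standing in for the upper layer: the child
    process is not told what the timeout is.
    """
    try:
        # subprocess.run kills the child and waits for it before
        # raising TimeoutExpired, so no stray process is left behind.
        return subprocess.run(cmd, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        return None


def describe(result):
    """Turn the helper's return value into a short status string."""
    if result is None:
        return "timed out"
    return "ok" if result == 0 else "failed"
```

For instance, describe(run_with_timeout([sys.executable, "-c", "pass"], 10))
reports a clean exit, while a child that sleeps longer than the limit is
killed and reported as timed out.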
> > > I am trying to find out what the expected behavior should be for a
> > > timeout on a start or monitor command.
> >
> > A timeout on start is actually a timeout on monitor. Every
> > stonith start includes a monitor operation. Otherwise, start
> > should've been named "enable" for stonith resources.
> >
>
> Is it OK and expected for a Stonith resource which has timed out to go
> into a state where it cannot run on any node without user intervention?

No, I shouldn't think so. Is it failcount related? There's also a
cluster property called start-failure-is-fatal. Set that to false.
Still, if monitor fails, the resource should restart or move
elsewhere, depending on the stickiness settings.

Thanks,
Dejan
> Thanks,
> -ab
>