[Pacemaker] Question on ILO stonith resource config and restarting

Dejan Muhamedagic dejanmm at fastmail.fm
Wed Nov 5 10:46:47 EST 2008


On Wed, Nov 05, 2008 at 08:56:22AM -0500, Aaron Bush wrote:
> > 
> > BTW, did you try to test your ilo device with the stonith
> > program? Use -d to get debugging output.
> > 
> 
> I did not try it via stonithd -d.

No, I meant stonith -d ... That's an external program, nothing to
do with your cluster configuration.
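
For illustration, a minimal Python sketch of how such a standalone
test command could be assembled. The -d flag is the one mentioned
above; -t (device type) and -S (status check) are stonith options.
The plugin name external/riloe and the parameter names and values
are assumptions, to be replaced with your own (stonith -t <type> -n
should list the real parameter names).

```python
import shlex

def stonith_test_argv(device_type, params):
    """Build argv for a standalone status check with debugging on."""
    argv = ["stonith", "-d", "-t", device_type]
    # Device parameters are passed as name=value arguments.
    argv += [f"{name}={value}" for name, value in params.items()]
    argv.append("-S")  # ask the device for its status
    return argv

# Hypothetical plugin name and parameters; substitute your own.
argv = stonith_test_argv(
    "external/riloe",
    {"hostlist": "node1", "ilo_hostname": "node1-ilo"},
)
print(shlex.join(argv))
```

The sketch only prints the command line; it does not run it, so it
is safe to try on a machine without the stonith program installed.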

> I was just tinkering with the
> actual resource Python script (after setting the appropriate environment
> variables).
> 
> When the LAN connection is up and available the script works well.  When
> the connection is down it also works well and a timeout is thrown.
> 
> 
> > I'd prefer to have the upper layer (stonithd) timeout. Why do
> > you think that this would help?
> 
> That is fine.  I was just taking a stab at it and hoping to start a
> discussion on whether the timeout should exist in the resource script.
> Since the upper layer catches it, that is a good and safe place for a
> catch-all: it prevents underlying script errors/bugs from hanging the
> cluster.

The problem with handling timeouts within resources is that they
don't know the configured timeout values. Hence, it's better that
they just keep trying until the user-configured timeout occurs.
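
A rough Python sketch of that division of labour, not the actual
stonithd code: the caller owns the configured timeout and kills the
child when it expires, while the child simply keeps retrying. The
function name and the RETRY_FOREVER stand-in command are hypothetical.

```python
import subprocess
import sys

def run_with_timeout(argv, timeout_s):
    """Run a child command; give up once timeout_s seconds have passed.

    Returns (timed_out, returncode); returncode is None on timeout.
    """
    try:
        proc = subprocess.run(argv, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising.
        return True, None
    return False, proc.returncode

# Hypothetical stand-in for a device script that retries forever.
RETRY_FOREVER = [
    sys.executable, "-c",
    "import time\nwhile True:\n    time.sleep(0.1)",
]

if __name__ == "__main__":
    print(run_with_timeout(RETRY_FOREVER, 1.0))
```

The child never needs to know the timeout value, so changing the
configured timeout requires no change to the device script.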

> > > I am trying to find out what the expected behavior should be for a
> > > timeout on a start or monitor command.
> > 
> > A timeout on start is actually a timeout on monitor. Every
> > stonith start includes a monitor operation. Otherwise, start
> > should've been named "enable" for stonith resources.
> > 
> 
> Is it OK and expected for a stonith resource that has timed out to end
> up unable to run on any node without user intervention?

No, I shouldn't think so. Is it failcount related? There's also a
cluster property called start-failure-is-fatal. Set that to
false. Still, if monitor fails, the resource should restart or
move elsewhere, depending on the stickiness settings.
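
As a hedged illustration, assuming the property meant here is
start-failure-is-fatal and that crm_attribute is used to set it,
this Python sketch assembles the command (-t crm_config selects the
cluster options, -n and -v give the name and value). It only prints
the command; check the options against your installed version
before running it.

```python
def cluster_property_argv(name, value):
    """Build a crm_attribute command that sets a cluster-wide property."""
    # -t crm_config selects the cluster options section of the CIB.
    return ["crm_attribute", "-t", "crm_config", "-n", name, "-v", str(value)]

argv = cluster_property_argv("start-failure-is-fatal", "false")
print(" ".join(argv))
```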

Thanks,

Dejan


> Thanks,
> -ab
> 
> _______________________________________________
> Pacemaker mailing list
> Pacemaker at clusterlabs.org
> http://list.clusterlabs.org/mailman/listinfo/pacemaker



