[Pacemaker] Enable remote monitoring

Gao,Yan ygao at suse.com
Sun Dec 16 10:48:04 EST 2012


On 12/12/12 17:51, Lars Marowsky-Bree wrote:
> On 2012-12-11T12:53:39, David Vossel <dvossel at redhat.com> wrote:
> 
> Excellent progress!
> 
> Just one aspect caught my eye:
> 
>>> - on-fail defaults to "restart-container" for most actions,
>>>   except for stop op (Not sure what it means if a stop fails. A
>>>   nagios daemon cannot be terminated? Should it always return
>>>   success?),
>>
>> A nagios "stop" action should always return success. The nagios
>> agent doesn't even need a stop function; the lrmd can know to treat
>> a "stop" as a (no-op for stop) + (cancel all recurring actions). In
>> this case, if the nagios agent doesn't stop successfully, it is
>> because of an lrmd failure, which should result in a fencing action,
>> I'd imagine.
> 
> That's something that, IMHO, shouldn't be handled by the container
> abstraction, but - like you say - by the LRM/class code.
> 
> I think on-fail="restart-container" makes sense even for stop. If
> "stop" can't technically fail for a given class, even better. But it
> could mean that we actually need to stop some monitoring daemon or
> whatever.
> 
> The other logic might be to set it to "ignore", which would also work
> for me (even if a bit less obviously).
Makes sense. Now on-fail for the stop op also defaults to
"restart-container":

https://github.com/gao-yan/pacemaker/commits/container

And yes, internally, failed stop operations are ignored so that the
container can be restarted, given that there's a mandatory order.
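
To illustrate the behaviour described above, here is a minimal sketch in
Python. It is not the actual Pacemaker code: the function names
effective_on_fail() and handle_failure() are hypothetical, and the sketch
only covers resources that live inside a container.

```python
# Hypothetical sketch; names are illustrative, not Pacemaker's real code.

def effective_on_fail(op_name, configured=None):
    """Return the on-fail policy for an operation of a contained resource."""
    if configured is not None:
        # An explicitly configured on-fail value always wins.
        return configured
    # Default for every action of a contained resource, stop included.
    return "restart-container"


def handle_failure(op_name, policy):
    """List the steps taken when an operation fails under the given policy."""
    if policy == "restart-container":
        if op_name == "stop":
            # The failed stop is ignored so the container restart can proceed.
            return ["ignore failed stop", "restart container"]
        return ["restart container"]
    # Any other policy is passed through unchanged in this sketch.
    return [policy]
```

With these assumptions, a failed stop and a failed monitor both end in a
container restart; the only difference is that the stop failure itself is
dropped first.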

> 
> But really I'd not want to make "oh let's just skip stop for contained
> resources" here ;-)
> 
>>> - Failures of resources count against container's
>> What happens if someone wants to clear the container's failcount? Do
>> we need to add some logic to go in and clear all the child
>> resources' failures as well to make this happen correctly?
> 
> That appears to make sense.
Will do that.
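
Roughly, the idea could look like the following Python sketch. The
FailCounts class and its methods are hypothetical and only model the
bookkeeping. It assumes a failure is counted both for the resource itself
and for its container; the real implementation may differ.

```python
from collections import defaultdict


class FailCounts:
    """Hypothetical failcount bookkeeping for containers and their children."""

    def __init__(self, children_of):
        # children_of maps a container id to the ids of its contained resources.
        self.children_of = children_of
        self.container_of = {
            child: container
            for container, children in children_of.items()
            for child in children
        }
        self.counts = defaultdict(int)

    def record_failure(self, rsc):
        # Assumption: a failure counts for the resource and for its container.
        self.counts[rsc] += 1
        container = self.container_of.get(rsc)
        if container is not None:
            self.counts[container] += 1

    def clear(self, rsc):
        # Clearing a container also clears the failures of its children.
        self.counts.pop(rsc, None)
        for child in self.children_of.get(rsc, []):
            self.counts.pop(child, None)
```

Clearing a single contained resource leaves the container's count alone in
this sketch; only clearing the container cascades to the children.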

Regards,
  Gao,Yan
-- 
Gao,Yan <ygao at suse.com>
Software Engineer
China Server Team, SUSE.



