[Pacemaker] monitor on-fail=ignore not restarting when resource reported as stopped

Patrick Hemmer pacemaker at feystorm.net
Fri Dec 6 15:58:19 EST 2013

------------------------------------------------------------------------
*From: *Lars Marowsky-Bree <lmb at suse.com>
*Sent: * 2013-12-06 13:44:53 E
*To: *The Pacemaker cluster resource manager <pacemaker at oss.clusterlabs.org>
*Subject: *Re: [Pacemaker] monitor on-fail=ignore not restarting when
resource reported as stopped

> On 2013-12-06T11:21:02, Patrick Hemmer <pacemaker at feystorm.net> wrote:
>
>>> So where is the problem? If the script returns "ERROR" then pacemaker has to
>>> act accordingly.
>> If the script returns "ERROR" the `on-fail=ignore` should make it do
>> nothing. Amazon's API failed, we need to just retry again later.
>> If the script returns "STOPPED", this isn't an error. The script queried
>> the resource, found it was stopped, and reported it as stopped.
>> Pacemaker should act accordingly and start it back up.
> For a resource that pacemaker expects to be started, it's an error if it
> is found to be stopped. Pacemaker can't tell if it is really cleanly
> stopped, or died, or ...
>
> If you want Pacemaker to recover failed resources, do not set
> on-fail="ignore". I still don't quite get why you set that when you
> obviously don't want the associated behaviour?
Then let me ask this: what is the point of having $OCF_ERR_GENERIC and
$OCF_NOT_RUNNING if they both behave the same?
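
To make the distinction concrete, here is a minimal sketch of a monitor
action that separates "the query failed" from "the query succeeded and the
resource is stopped". The numeric values are the standard OCF exit codes;
query_state and FAKE_STATE are hypothetical stand-ins for the real API
call and are not part of any actual agent.

```shell
#!/bin/sh
# Standard OCF exit codes; normally supplied by the OCF shell function library.
: "${OCF_SUCCESS:=0}"
: "${OCF_ERR_GENERIC:=1}"
: "${OCF_NOT_RUNNING:=7}"

# Hypothetical stand-in for the remote API query.
# Prints "running" or "stopped"; returns non-zero if the API is unreachable.
query_state() {
    case "${FAKE_STATE:-}" in
        running|stopped) echo "$FAKE_STATE" ;;
        *) return 1 ;;
    esac
}

monitor() {
    # The query itself failed: the real state is unknown.
    state=$(query_state) || return "$OCF_ERR_GENERIC"

    case "$state" in
        running) return "$OCF_SUCCESS" ;;
        # The query worked and reported a clean stop.
        stopped) return "$OCF_NOT_RUNNING" ;;
        *)       return "$OCF_ERR_GENERIC" ;;
    esac
}
```

With this split, the agent only reports $OCF_NOT_RUNNING when it actually
knows the resource is stopped, and $OCF_ERR_GENERIC when it could not find
out either way.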

>
> Regards,
>     Lars
>
