[Pacemaker] monitor on-fail=ignore not restarting when resource reported as stopped

Lars Marowsky-Bree lmb at suse.com
Fri Dec 6 13:44:53 EST 2013


On 2013-12-06T11:21:02, Patrick Hemmer <pacemaker at feystorm.net> wrote:

> > So where is the problem? If the script returns "ERROR" then pacemaker has to
> > act accordingly.
> If the script returns "ERROR" the `on-fail=ignore` should make it do
> nothing. Amazon's API failed, we need to just retry again later.
> If the script returns "STOPPED", this isn't an error. The script queried
> the resource, found it was stopped, and reported it as stopped.
> Pacemaker should act accordingly and start it back up.

For a resource that Pacemaker expects to be started, it's an error if it
is found to be stopped. Pacemaker can't tell whether it was really cleanly
stopped, or died, or ...

If you want Pacemaker to recover failed resources, do not set
on-fail="ignore". I still don't quite get why you set that when you
obviously don't want the associated behaviour?
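
To make the point concrete, here is a rough sketch in Python. It is a
simplified model of the behaviour described above, not Pacemaker's actual
code. The function name evaluate_monitor and the "nothing" action label are
made up for illustration; the numeric values are the standard OCF return
codes.

```python
# Simplified, illustrative model only; not Pacemaker's implementation.
# Standard OCF resource agent return codes:
OCF_SUCCESS = 0       # resource is running
OCF_ERR_GENERIC = 1   # generic failure, e.g. a backend API call failed
OCF_NOT_RUNNING = 7   # resource is cleanly stopped


def evaluate_monitor(expected_running, monitor_rc, on_fail="restart"):
    """Hypothetical helper: return (failed, action) for one monitor result.

    A monitor result counts as a failure whenever it differs from what the
    cluster expects, so "stopped" is a failure if the resource should be
    running. The action taken on failure is whatever on-fail says.
    """
    expected_rc = OCF_SUCCESS if expected_running else OCF_NOT_RUNNING
    if monitor_rc == expected_rc:
        return (False, "nothing")
    # Unexpected result: on-fail decides, and "ignore" means no recovery.
    return (True, on_fail)


# A resource that should be running reports "stopped":
print(evaluate_monitor(True, OCF_NOT_RUNNING, on_fail="ignore"))
print(evaluate_monitor(True, OCF_NOT_RUNNING, on_fail="restart"))
```

In this model, "ERROR" and "STOPPED" both land in the same failure branch
for a resource that is expected to be started, so on-fail="ignore"
suppresses recovery for both.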


Regards,
    Lars

-- 
Architect Storage/HA
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 21284 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde




