[Pacemaker] behaviour of failure-timeout
Kashif Jawed Siddiqui
kashifjs at huawei.com
Fri Sep 14 01:24:08 UTC 2012
Hi,
>>I could not find any detailed explanation in the doc, how
>>"failure-timeout" behaves, can someone clarify that?
failure-timeout clears the failcount once it has been increased (i.e. the resource has failed) and the timeout has elapsed.
The cluster is primarily event driven, but some actions take effect based on time. The interval for these time-based checks is set with the cluster property cluster-recheck-interval. The default is 15 minutes.
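To make the timing concrete, here is a rough sketch in Python. It is only a simplified model, not Pacemaker code: the function name is made up, and it assumes no other cluster events occur and that rechecks fire at fixed multiples of cluster-recheck-interval (the real timer is tied to cluster activity).

```python
import math

def expiry_noticed_at(failure_time, failure_timeout, recheck_interval):
    """Time (in seconds) at which an expired failure is first noticed.

    Hypothetical model: no other cluster events happen, and rechecks
    fire at t = recheck_interval, 2 * recheck_interval, ...
    """
    expiry = failure_time + failure_timeout
    # First recheck tick at or after the moment the failure expires.
    ticks = math.ceil(expiry / recheck_interval)
    return ticks * recheck_interval

# Failure at t=100s, failure-timeout=60s, recheck every 15 min (900s).
print(expiry_noticed_at(100, 60, 900))
```

In this model the failure expires at t=160s, but the failcount is only cleared at the next recheck, so a short failure-timeout combined with a long cluster-recheck-interval can look as if the timeout is being ignored.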
>>My rough understanding so far is, that after a failcount is increased,
>>pacemaker "waits for the failure-timeout" to expire and then checks if
>>the failure condition is still on. If not, it will reset the failcount
>>on that node. Now
Yes
>>- How does pacemaker check that, is it using a monitor operation?
As mentioned above, it depends on cluster events or on the cluster-recheck-interval.
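As a small illustration of "events or cluster-recheck-interval", the sketch below picks whichever comes first after the failure has expired. Again, this is an assumed, simplified model: the function name and the list of event times are hypothetical, and rechecks are treated as fixed ticks.

```python
import math

def failcount_cleared_at(expiry, recheck_interval, event_times=()):
    """Earliest re-evaluation at or after `expiry` (all times in seconds).

    Hypothetical model: the cluster re-evaluates its state either on a
    cluster event or on a recheck tick, whichever happens first.
    """
    next_tick = math.ceil(expiry / recheck_interval) * recheck_interval
    # Only events that happen after the failure has expired can clear it.
    later_events = [t for t in event_times if t >= expiry]
    return min([next_tick] + later_events)

# Failure expires at t=160s; rechecks every 900s; events at t=50s and t=300s.
print(failcount_cleared_at(160, 900, [50, 300]))
print(failcount_cleared_at(160, 900))
```

With an event at t=300s the failcount is cleared then; with no events it has to wait for the recheck at t=900s.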
>>- are there re-checks at later times
Same as above
>>- Are checks only run on the node where the failcount was increased or
>>on all
No, it is run on all nodes.
Regards,
Kashif Jawed Siddiqui
>>Cheers!
>>Mario
_______________________________________________
Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org