[Pacemaker] on_fail

Dominik Klein dk at in-telegence.net
Fri Feb 13 06:31:49 EST 2009


Romi Verma wrote:
>> setting on_fail=fence for a monitor op will cause the cluster to shoot the
>> node immediately instead of trying to stop the resource and recover it
>> without fencing.
>>
> 
> You said "instead of trying to stop the resource and recover it without
> fencing."
> 
> Do you mean that if on_fail is not fence, then after a monitor op failure it
> will try starting the resource on the same node again

The default action upon monitor failure is to stop the resource and then
start it again (but read on below).

> , before declaring the
> node errant? If that is true, can we set this retry count?

There's a meta attribute "migration-threshold", which controls after how
many monitor failures the resource will be failed over to another node.
Say you set this to 2: after the first failure, the cluster will try to
restart the resource on the same node. After the second failure, you
will see a failover.
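
As a rough illustration of the threshold behaviour described above, here
is a small Python sketch of the decision. This is not Pacemaker code: the
function name recovery_action and the returned strings are made up for
this example, and it assumes the default on_fail behaviour (stop, then
start) and a positive migration-threshold.

```python
def recovery_action(fail_count, migration_threshold):
    """Model the response to the fail_count-th monitor failure on a node.

    Hypothetical helper, not part of Pacemaker. fail_count counts the
    failures seen on the current node, including the one just detected.
    """
    if fail_count >= migration_threshold:
        # Threshold reached: the resource is moved away from this node.
        return "failover to another node"
    # Below the threshold: stop and start again on the same node.
    return "restart on same node"


# migration-threshold=2, as in the example above
for failures in (1, 2):
    print(failures, recovery_action(failures, 2))
```

With a threshold of 2, the first failure maps to a local restart and the
second to a failover, matching the description above.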

> And if on_fail=fence, then it will never retry to start the resource on the
> same node; it will just fail over the resource to another node and shoot the
> errant node immediately.

It will immediately fence (turn off or reboot, depending on your
configuration) the node on which the operation failed.
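
To make the contrast with the default explicit, here is a minimal Python
sketch of the first response to a failed monitor op. It is only a model
of what is described in this thread, not Pacemaker code: the function
name first_response and the step strings are invented for illustration,
and "restart" is used here as the label for the default stop-then-start
behaviour.

```python
def first_response(on_fail="restart"):
    """Model the cluster's first reaction to a failed monitor operation.

    Hypothetical helper, not part of Pacemaker. Only the two cases
    discussed in this thread are modelled.
    """
    if on_fail == "fence":
        # No attempt to stop and recover the resource first.
        return ["fence node"]
    if on_fail == "restart":
        # Default: recover the resource without fencing.
        return ["stop resource", "start resource"]
    raise ValueError("on_fail value not covered by this sketch: %r" % on_fail)


print(first_response())
print(first_response("fence"))
```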

Regards
Dominik
