[Pacemaker] fencing to recover from failed resources

Bart Coninckx bart.coninckx at telenet.be
Wed Jan 12 21:52:14 UTC 2011


Hi,

I get a lot of fencing on my two-node cluster, accompanied by these messages:

Jan 12 22:20:34 xen2 pengine: [6633]: info: get_failcount: intranet1 has failed INFINITY times on xen1
Jan 12 22:20:34 xen2 pengine: [6633]: info: get_failcount: intranet1 has failed INFINITY times on xen1
Jan 12 22:20:34 xen2 pengine: [6633]: WARN: unpack_rsc_op: Processing failed op intranet1_monitor_60000 on xen1: unknown exec error (-2)
Jan 12 22:20:34 xen2 pengine: [6633]: info: get_failcount: intranet1 has failed INFINITY times on xen1
Jan 12 22:20:34 xen2 pengine: [6633]: WARN: unpack_rsc_op: Processing failed op intranet1_stop_0 on xen1: unknown exec error (-2)
Jan 12 22:20:34 xen2 pengine: [6633]: WARN: pe_fence_node: Node xen1 will be fenced to recover from resource failure(s)


My monitor operations are set to restart a resource when it fails. What makes
the PE decide to fence the node instead of first trying to restart the
resource, as the monitor operation is configured to do?
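
To make the setup concrete, a monitor operation of this kind would look
roughly like the following in crm shell syntax. This is only an illustrative
sketch: the resource name and the 60-second interval match the logs above,
but the resource agent, the xmfile path and the timeout are assumptions, not
a copy of the actual configuration.

    # Illustrative only: agent, xmfile path and timeout are assumed values
    primitive intranet1 ocf:heartbeat:Xen \
        params xmfile="/etc/xen/vm/intranet1" \
        op monitor interval="60s" timeout="30s" on-fail="restart"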

Thank you!

Bart



