[Pacemaker] PE ignores monitor failure of stonith:external/rackpdu

Fri Oct 29 06:37:04 UTC 2010

Hi,

I wanted to check what happens when the monitor of a fencing agent
fails, so I disconnected the PDU from the network, reduced the monitor
interval, and added debug statements to the fencing script.

Here are the debug statements in the status action:
status)
        if [ -z "$pduip" ]; then
            exit 1
        fi
        date >> /tmp/pdu.monitor
        if ping -w1 -c1 $pduip >/dev/null 2>&1; then
            exit 0
        else
            echo "failed" >> /tmp/pdu.monitor
            exit 1
        fi
        ;;
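
To confirm the exit code outside the cluster, the same logic can be run
as a standalone function. This is only a sketch: check_pdu and the PROBE
override are names made up for illustration and are not part of the
rackpdu agent.

```shell
#!/bin/sh
# Hypothetical standalone copy of the status check above.
# PROBE is an assumed override so the probe command can be swapped out;
# by default it is the same ping invocation the status action uses.
check_pdu() {
    pduip=$1
    # An empty address is treated as a failure, as in the agent.
    if [ -z "$pduip" ]; then
        return 1
    fi
    # Left unquoted on purpose so "ping -w1 -c1" splits into separate words.
    if ${PROBE:-ping -w1 -c1} "$pduip" >/dev/null 2>&1; then
        return 0
    fi
    return 1
}
```

Running check_pdu 192.168.100.100; echo "rc=$?" with the PDU unplugged
should print a non-zero return code if the check itself is sound.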

Here is the debug output, which shows that the monitor failed:
[root@node-03 tmp]# cat pdu.monitor
Fri Oct 29 08:29:20 CEST 2010
Fri Oct 29 08:31:05 CEST 2010
failed
Fri Oct 29 08:32:50 CEST 2010
failed

But Pacemaker thinks the resource is fine:
[root@node-03 tmp]# crm status|grep pdu
 pdu    (stonith:external/rackpdu):     Started node-03
[root@node-03 tmp]#

And here is the resource definition:
primitive pdu stonith:external/rackpdu \
        params community="empisteftiko" \
        names_oid=".1.3.6.1.4.1.318.1.1.4.4.2.1.4" \
        oid=".1.3.6.1.4.1.318.1.1.4.4.2.1.3" hostlist="AUTO" \
        pduip="192.168.100.100" stonith-timeout="30" \
        op monitor interval="1m" timeout="60s"

Is this the expected behaviour?

Cheers,
Pavlos