[Pacemaker] PE ignores monitor failure of stonith:external/rackpdu
Pavlos Parissis
pavlos.parissis at gmail.com
Tue Nov 2 12:09:02 UTC 2010
On 2 November 2010 13:02, Dejan Muhamedagic <dejanmm at fastmail.fm> wrote:
[...snip...]
>
> > > Definitely not. If you do the monitor action from the command
> > > line does that also return the unexpected exit code:
> > >
> >
> > from the code I pasted you can see it returned 1.
>
> There is a difference. stonith-ng (stonithd) is a daemon that
> runs a perl script (fencing_legacy) which invokes stonith which
> then invokes the plugin. A problem can occur in any of these
> components. It's important to find out where.
>
> > > # stonith -t external/rackpdu community="empisteftiko"
> > > names_oid=".1.3.6.1.4.1.318.1.1.4.4.2.1.4" ... -lS
> > >
> > > Which pacemaker release do you run? I couldn't reproduce this
> > > with a recent Pacemaker.
> > >
> >
> > that it was on 1.1.3 and now I run 1.0.9.
> > Do you want me to run the test on 1.0.9?
>
> Yes, please. 1.0.9 is still running the old, and well tested,
> stonithd, so the result could be different.
>
>
I have the PDU switched off because it stopped working! As a result the
resource is stopped.
But I did the test anyway, and I see that even though rackpdu returns 1 on
status, stonithd reports 256.
Here is the output of running stonith; remember, the PDU is off.
[root@node-01 ~]# stonith -d -t external/rackpdu \
    hostlist="node-01,node-02,node-03" pduip="192.168.100.100" \
    community="empisteftiko" names_oid=".1.3.6.1.4.1.318.1.1.4.4.2.1.4" -l
** (process:8115): DEBUG: NewPILPluginUniv(0x8f690c8)
** (process:8115): DEBUG: PILS: Plugin path =
/usr/lib/stonith/plugins:/usr/lib/heartbeat/plugins
** (process:8115): DEBUG: NewPILInterfaceUniv(0x8f69768)
** (process:8115): DEBUG: NewPILPlugintype(0x8f69a28)
** (process:8115): DEBUG: NewPILPlugin(0x8f69a40)
** (process:8115): DEBUG: NewPILInterface(0x8f69b50)
** (process:8115): DEBUG:
NewPILInterface(0x8f69b50:InterfaceMgr/InterfaceMgr)*** user_data: 0x0
*******
** (process:8115): DEBUG:
InterfaceManager_plugin_init(0x8f69b50/InterfaceMgr)
** (process:8115): DEBUG: Registering Implementation manager for Interface
type 'InterfaceMgr'
** (process:8115): DEBUG: PILS: Looking for InterfaceMgr/generic =>
[/usr/lib/stonith/plugins/InterfaceMgr/generic.so]
** (process:8115): DEBUG: Plugin file
/usr/lib/stonith/plugins/InterfaceMgr/generic.so does not exist
** (process:8115): DEBUG: PILS: Looking for InterfaceMgr/generic =>
[/usr/lib/heartbeat/plugins/InterfaceMgr/generic.so]
** (process:8115): DEBUG: Plugin path for InterfaceMgr/generic =>
[/usr/lib/heartbeat/plugins/InterfaceMgr/generic.so]
** (process:8115): DEBUG: PluginType InterfaceMgr already present
** (process:8115): DEBUG: Plugin InterfaceMgr/generic init function:
InterfaceMgr_LTX_generic_pil_plugin_init
** (process:8115): DEBUG: NewPILPlugin(0x8f6a1d8)
** (process:8115): DEBUG: Plugin InterfaceMgr/generic loaded and
constructed.
** (process:8115): DEBUG: Calling init function in plugin
InterfaceMgr/generic.
** (process:8115): DEBUG: NewPILInterface(0x8f69cd8)
** (process:8115): DEBUG:
NewPILInterface(0x8f69cd8:InterfaceMgr/stonith2)*** user_data: 0x8f69b18
*******
** (process:8115): DEBUG: Registering Implementation manager for Interface
type 'stonith2'
** (process:8115): DEBUG: IfIncrRefCount(1 + 1 )
** (process:8115): DEBUG: PluginIncrRefCount(0 + 1 )
** (process:8115): DEBUG: IfIncrRefCount(1 + 100 )
** (process:8115): DEBUG: PILS: Looking for stonith2/external =>
[/usr/lib/stonith/plugins/stonith2/external.so]
** (process:8115): DEBUG: Plugin path for stonith2/external =>
[/usr/lib/stonith/plugins/stonith2/external.so]
** (process:8115): DEBUG: Creating PluginType for stonith2
** (process:8115): DEBUG: NewPILPlugintype(0x8f6a398)
** (process:8115): DEBUG: Plugin stonith2/external init function:
stonith2_LTX_external_pil_plugin_init
** (process:8115): DEBUG: NewPILPlugin(0x8f69d68)
** (process:8115): DEBUG: Plugin stonith2/external loaded and constructed.
** (process:8115): DEBUG: Calling init function in plugin stonith2/external.
** (process:8115): DEBUG: NewPILInterface(0x8f6a3b0)
** (process:8115): DEBUG: NewPILInterface(0x8f6a3b0:stonith2/external)***
user_data: 0x9e9fbc *******
** (process:8115): DEBUG: IfIncrRefCount(101 + 1 )
** (process:8115): DEBUG: PluginIncrRefCount(0 + 1 )
** (process:8115): DEBUG: external_set_config: called.
** (process:8115): DEBUG: external_get_confignames: called.
** (process:8115): DEBUG: external_run_cmd: Calling
'/usr/lib/stonith/plugins/external/rackpdu getconfignames'
** (process:8115): DEBUG: external_run_cmd:
'/usr/lib/stonith/plugins/external/rackpdu getconfignames' output: hostlist
pduip community
** (process:8115): DEBUG: external_get_confignames: 'rackpdu getconfignames'
returned 0
** (process:8115): DEBUG: plugin output: hostlist pduip community
** (process:8115): DEBUG: external_get_confignames: rackpdu configname
hostlist
** (process:8115): DEBUG: external_get_confignames: rackpdu configname pduip
** (process:8115): DEBUG: external_get_confignames: rackpdu configname
community
** (process:8115): DEBUG: external_status: called.
** (process:8115): DEBUG: external_run_cmd: Calling
'/usr/lib/stonith/plugins/external/rackpdu status'
** INFO: external_run_cmd: Calling
'/usr/lib/stonith/plugins/external/rackpdu status' returned 256
** (process:8115): CRITICAL **: external_status: 'rackpdu status' failed
with rc 256
** (process:8115): DEBUG: external_getinfo: called.
** (process:8115): DEBUG: external_run_cmd: Calling
'/usr/lib/stonith/plugins/external/rackpdu getinfo-devid'
** (process:8115): DEBUG: external_run_cmd:
'/usr/lib/stonith/plugins/external/rackpdu getinfo-devid' output: rackpdu
STONITH device
** (process:8115): DEBUG: external_getinfo: 'rackpdu getinfo-devid' returned
0
** (process:8115): DEBUG: external_hostlist: called.
** (process:8115): DEBUG: external_run_cmd: Calling
'/usr/lib/stonith/plugins/external/rackpdu gethosts'
** (process:8115): DEBUG: external_run_cmd:
'/usr/lib/stonith/plugins/external/rackpdu gethosts' output: node-01
node-02
node-03
** (process:8115): DEBUG: external_hostlist: running 'rackpdu gethosts'
returned 0
** (process:8115): DEBUG: external_hostlist: rackpdu host node-01
** (process:8115): DEBUG: external_hostlist: rackpdu host node-02
** (process:8115): DEBUG: external_hostlist: rackpdu host node-03
node-01
node-02
node-03
** (process:8115): DEBUG: external_destroy: called.
** (process:8115): DEBUG: IfIncrRefCount(1 + -1 )
** (process:8115): DEBUG: RemoveAPILInterface(0x8f6a3b0/external)
** (process:8115): DEBUG: RmAPILInterface(0x8f6a3b0/external)
** (process:8115): DEBUG: PILunregister_interface(stonith2/external)
** (process:8115): DEBUG: Calling InterfaceClose on stonith2/external
** (process:8115): DEBUG: IfIncrRefCount(102 + -1 )
** (process:8115): DEBUG: PluginIncrRefCount(1 + -1 )
** (process:8115): DEBUG: RemoveAPILPlugin(stonith2/external)
** (process:8115): DEBUG: RmAPILPlugin(stonith2/external)
** (process:8115): DEBUG: Closing dlhandle for (stonith2/external)
** (process:8115): DEBUG: RmAPILPluginType(stonith2)
** (process:8115): DEBUG: DelPILPluginType(stonith2)
** (process:8115): DEBUG: DelPILInterface(0x8f6a3b0/external)
[root@node-01 ~]# stonith -t external/rackpdu \
    hostlist="node-01,node-02,node-03" pduip="192.168.100.100" \
    community="empisteftiko" names_oid=".1.3.6.1.4.1.318.1.1.4.4.2.1.4" -l
** INFO: external_run_cmd: Calling
'/usr/lib/stonith/plugins/external/rackpdu status' returned 256
** (process:8814): CRITICAL **: external_status: 'rackpdu status' failed
with rc 256
node-01
node-02
node-03
And here I invoke the rackpdu plugin directly:
[root@node-01 ~]# /usr/lib/stonith/plugins/external/rackpdu status
[root@node-01 ~]# echo $?
1
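A possible explanation for the 1 versus 256 mismatch, offered as an
assumption rather than something confirmed from the code: if the external
plugin wrapper logs the raw wait status of the child process instead of the
decoded exit code, then an exit code of 1 shows up as 256, because on Linux
a normal exit stores the exit code in bits 8-15 of the status. The sketch
below decodes such a value; the function name decode_wait_status is made up
for illustration, and stopped/continued statuses are ignored.

```shell
# Decode a raw wait(2)-style status into the child's exit code.
# Assumption: Linux layout, where a normal exit keeps the exit code in
# bits 8-15 and a terminating signal number sits in the low 7 bits.
decode_wait_status() {
    status=$1
    if [ $(( status & 127 )) -eq 0 ]; then
        # Low 7 bits are zero: the child exited normally.
        echo "exit $(( (status >> 8) & 255 ))"
    else
        # Otherwise the child was killed by a signal.
        echo "signal $(( status & 127 ))"
    fi
}

decode_wait_status 256   # prints "exit 1"
```

If that assumption holds, the 256 in the stonith output and the 1 from the
direct invocation are the same failure seen at two different layers.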
The following is the log from when I try to start the resource:
Nov 02 12:55:58 node-01 crmd: [19385]: info: do_lrm_rsc_op: Performing
key=108:59:0:569e2e9c-9272-4bd3-a262-b971cd349522 op=pdu_start_0 )
Nov 02 12:55:58 node-01 lrmd: [19382]: info: rsc:pdu:27: start
Nov 02 12:55:58 node-01 lrmd: [9248]: info: Try to start STONITH resource
<rsc_id=pdu> : Device=external/rackpdu
Nov 02 12:56:00 node-01 stonithd: [9254]: info: external_run_cmd: Calling
'/usr/lib/stonith/plugins/external/rackpdu status' returned 256
Nov 02 12:56:00 node-01 stonithd: [9254]: CRIT: external_status: 'rackpdu
status' failed with rc 256
Nov 02 12:56:00 node-01 stonithd: [19383]: WARN: start pdu failed, because
its hostlist is empty
Nov 02 12:56:00 node-01 crmd: [19385]: info: process_lrm_event: LRM
operation pdu_start_0 (call=27, rc=1, cib-update=49, confirmed=true) unknown
error
Nov 02 12:56:03 node-01 attrd: [19384]: info: attrd_trigger_update: Sending
flush op to all hosts for: fail-count-pdu (INFINITY)
Nov 02 12:56:03 node-01 crmd: [19385]: info: do_lrm_rsc_op: Performing
key=7:60:0:569e2e9c-9272-4bd3-a262-b971cd349522 op=pdu_stop_0 )
Nov 02 12:56:03 node-01 lrmd: [19382]: info: rsc:pdu:28: stop
Nov 02 12:56:03 node-01 lrmd: [9309]: info: Try to stop STONITH resource
<rsc_id=pdu> : Device=external/rackpdu
Nov 02 12:56:03 node-01 stonithd: [19383]: notice: try to stop a resource
pdu who is not in started resource queue.
Nov 02 12:56:03 node-01 crmd: [19385]: info: process_lrm_event: LRM
operation pdu_stop_0 (call=28, rc=0, cib-update=50, confirmed=true) ok
Nov 02 12:56:03 node-01 attrd: [19384]: info: attrd_perform_update: Sent
update 300: fail-count-pdu=INFINITY
Nov 02 12:56:03 node-01 attrd: [19384]: info: attrd_trigger_update: Sending
flush op to all hosts for: last-failure-pdu (1288698962)
Nov 02 12:56:03 node-01 attrd: [19384]: info: attrd_perform_update: Sent
update 302: last-failure-pdu=1288698962
Nov 02 12:56:04 node-01 lrmd: [19382]: info: rsc:pdu:29: start
Nov 02 12:56:04 node-01 crmd: [19385]: info: do_lrm_rsc_op: Performing
key=109:60:0:569e2e9c-9272-4bd3-a262-b971cd349522 op=pdu_start_0 )
Nov 02 12:56:04 node-01 lrmd: [9311]: info: Try to start STONITH resource
<rsc_id=pdu> : Device=external/rackpdu
Nov 02 12:56:06 node-01 stonithd: [9316]: info: external_run_cmd: Calling
'/usr/lib/stonith/plugins/external/rackpdu status' returned 256
Nov 02 12:56:06 node-01 stonithd: [9316]: CRIT: external_status: 'rackpdu
status' failed with rc 256
Nov 02 12:56:06 node-01 stonithd: [19383]: WARN: start pdu failed, because
its hostlist is empty
Nov 02 12:56:06 node-01 crmd: [19385]: info: process_lrm_event: LRM
operation pdu_start_0 (call=29, rc=1, cib-update=51, confirmed=true) unknown
error
Nov 02 12:56:08 node-01 attrd: [19384]: info: attrd_trigger_update: Sending
flush op to all hosts for: last-failure-pdu (1288698969)
Nov 02 12:56:08 node-01 crmd: [19385]: info: do_lrm_rsc_op: Performing
key=7:61:0:569e2e9c-9272-4bd3-a262-b971cd349522 op=pdu_stop_0 )
Nov 02 12:56:08 node-01 lrmd: [19382]: info: rsc:pdu:30: stop
Nov 02 12:56:08 node-01 lrmd: [9358]: info: Try to stop STONITH resource
<rsc_id=pdu> : Device=external/rackpdu
Nov 02 12:56:08 node-01 stonithd: [19383]: notice: try to stop a resource
pdu who is not in started resource queue.
Nov 02 12:56:08 node-01 crmd: [19385]: info: process_lrm_event: LRM
operation pdu_stop_0 (call=30, rc=0, cib-update=52, confirmed=true) ok
Nov 02 12:56:08 node-01 attrd: [19384]: info: attrd_perform_update: Sent
update 304: last-failure-pdu=1288698969
Nov 02 12:56:34 node-01 crmd: [19385]: info: do_lrm_invoke: Removing
resource pdu from the LRM
Nov 02 12:56:34 node-01 crmd: [19385]: info: do_lrm_invoke: Resource 'pdu'
deleted for 9638_crm_resource on node-01
Nov 02 12:56:34 node-01 crmd: [19385]: info: notify_deleted: Notifying
9638_crm_resource on node-01 that pdu was deleted
Nov 02 12:56:34 node-01 crmd: [19385]: info: send_direct_ack: ACK'ing
resource op pdu_delete_60000 from 0:0:crm-resource-9638:
lrm_invoke-lrmd-1288698994-27
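To pick the failed operations out of a log like the one above, a sed filter
is enough. This is only a minimal sketch: the function name failed_ops is
made up for illustration, and it assumes the
"operation NAME (call=N, rc=N, ..." format that crmd prints in the
process_lrm_event lines above.

```shell
# Print "NAME call=N rc=N" for every LRM operation with a non-zero rc.
# Reads log text on stdin; lines that do not match are dropped.
failed_ops() {
    sed -n 's/.*operation \([^ ]*\) (call=\([0-9]*\), rc=\([1-9][0-9]*\),.*/\1 call=\2 rc=\3/p'
}

# Example with two lines in the format shown in the log above:
printf '%s\n' \
    'operation pdu_start_0 (call=27, rc=1, cib-update=49, confirmed=true) unknown' \
    'operation pdu_stop_0 (call=28, rc=0, cib-update=50, confirmed=true) ok' |
    failed_ops
```

On the log above this would list the two pdu_start_0 failures (call=27 and
call=29) and skip the successful stops.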
The relevant bit of the configuration:
primitive pdu stonith:external/rackpdu \
        params community="empisteftiko" \
        names_oid=".1.3.6.1.4.1.318.1.1.4.4.2.1.4" \
        oid=".1.3.6.1.4.1.318.1.1.4.4.2.1.3" hostlist="node-01,node-02,node-03" \
        pduip="192.168.100.100" stonith-timeout="30" \
op monitor interval="1m" timeout="60s" \
meta target-role="Stopped"