[Pacemaker] another "node rebooting too quickly" bug?

Brian J. Murrell brian at interlinx.bc.ca
Thu Apr 24 15:39:44 EDT 2014


Hi,

As was previously discussed, there is a bug in the handling of a
STONITH operation when a node reboots too quickly.  I had a different
kind of failure that I suspect is the same kind of problem, just with
a different symptom.

The situation is a two-node cluster with two resources plus a fencing
resource.  Each node is running one of the two resources, and one of
them is also running the fencing resource.  One of the nodes is
killed.  The remaining node should notice this, STONITH it and take
over its resource.  What actually happens is all of the above, except
that instead of being taken over, the resource is restarted on the
node that was killed.
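
For reference, the configuration steps visible in the CIB diffs at
the end of this message amount to roughly the following.  This is a
reconstruction, not the exact commands: the resource XML file names
are placeholders, and the creation of the st-fencing device itself
never appears in these logs.

  # resource1: prefers node1 (score 20) over node2 (score 10)
  cibadmin -o resources -C -x resource1.xml
  cibadmin -o constraints -C -X '<rsc_location id="resource1-primary" node="node1" rsc="resource1" score="20"/>'
  cibadmin -o constraints -C -X '<rsc_location id="resource1-secondary" node="node2" rsc="resource1" score="10"/>'

  # resource2: prefers node2 (score 20) over node1 (score 10)
  cibadmin -o resources -C -x resource2.xml
  cibadmin -o constraints -C -X '<rsc_location id="resource2-primary" node="node2" rsc="resource2" score="20"/>'
  cibadmin -o constraints -C -X '<rsc_location id="resource2-secondary" node="node1" rsc="resource2" score="10"/>'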

Here's the log from the surviving node -- the one that issues the
STONITH and *should* take over the resource:

Apr 22 19:47:26 node2 corosync[3204]:   [pcmk  ] notice: pcmk_peer_update: Transitional membership event on ring 20: memb=1, new=0, lost=1
Apr 22 19:47:26 node2 corosync[3204]:   [pcmk  ] info: pcmk_peer_update: memb: node2 2085752330
Apr 22 19:47:26 node2 corosync[3204]:   [pcmk  ] info: pcmk_peer_update: lost: node1 2068975114
Apr 22 19:47:26 node2 corosync[3204]:   [pcmk  ] notice: pcmk_peer_update: Stable membership event on ring 20: memb=1, new=0, lost=0
Apr 22 19:47:26 node2 corosync[3204]:   [pcmk  ] info: pcmk_peer_update: MEMB: node2 2085752330
Apr 22 19:47:26 node2 corosync[3204]:   [pcmk  ] info: ais_mark_unseen_peer_dead: Node node1 was not seen in the previous transition
Apr 22 19:47:26 node2 corosync[3204]:   [pcmk  ] info: update_member: Node 2068975114/node1 is now: lost
Apr 22 19:47:26 node2 corosync[3204]:   [pcmk  ] info: send_member_notification: Sending membership update 20 to 2 children
Apr 22 19:47:26 node2 corosync[3204]:   [TOTEM ] A processor joined or left the membership and a new membership was formed.

node1 is gone

Apr 22 19:47:26 node2 corosync[3204]:   [CPG   ] chosen downlist: sender r(0) ip(10.14.82.124) r(1) ip(10.128.2.124) ; members(old:2 left:1)
Apr 22 19:47:26 node2 corosync[3204]:   [MAIN  ] Completed service synchronization, ready to provide service.
Apr 22 19:47:26 node2 crmd[3255]:   notice: plugin_handle_membership: Membership 20: quorum lost
Apr 22 19:47:26 node2 crmd[3255]:   notice: crm_update_peer_state: plugin_handle_membership: Node node1[2068975114] - state is now lost (was member)
Apr 22 19:47:26 node2 crmd[3255]:  warning: match_down_event: No match for shutdown action on node1
Apr 22 19:47:26 node2 crmd[3255]:   notice: peer_update_callback: Stonith/shutdown of node1 not matched

Is this "Stonith/shutdown of node1 not matched" because the fencing
resource was running on node1, which is now gone?

Apr 22 19:47:26 node2 cib[3250]:   notice: plugin_handle_membership: Membership 20: quorum lost
Apr 22 19:47:26 node2 cib[3250]:   notice: crm_update_peer_state: plugin_handle_membership: Node node1[2068975114] - state is now lost (was member)
Apr 22 19:47:26 node2 crmd[3255]:   notice: do_state_transition: State transition S_IDLE -> S_INTEGRATION [ input=I_NODE_JOIN cause=C_FSA_INTERNAL origin=check_join_state ]
Apr 22 19:47:26 node2 attrd[3253]:   notice: attrd_local_callback: Sending full refresh (origin=crmd)
Apr 22 19:47:26 node2 attrd[3253]:   notice: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true)
Apr 22 19:47:26 node2 pengine[3254]:   notice: unpack_config: On loss of CCM Quorum: Ignore
Apr 22 19:47:26 node2 pengine[3254]:  warning: pe_fence_node: Node node1 will be fenced because the node is no longer part of the cluster
Apr 22 19:47:26 node2 pengine[3254]:  warning: determine_online_status: Node node1 is unclean
Apr 22 19:47:26 node2 pengine[3254]:  warning: custom_action: Action st-fencing_stop_0 on node1 is unrunnable (offline)
Apr 22 19:47:26 node2 pengine[3254]:  warning: custom_action: Action resource1_stop_0 on node1 is unrunnable (offline)

Resources from node1 unrunnable because it's AWOL!

Apr 22 19:47:26 node2 pengine[3254]:  warning: stage6: Scheduling Node node1 for STONITH

So kill it!

Apr 22 19:47:26 node2 pengine[3254]:   notice: LogActions: Move    st-fencing#011(Started node1 -> node2)
Apr 22 19:47:26 node2 pengine[3254]:   notice: LogActions: Move    resource1#011(Started node1 -> node2)

So here are the resources that were on node1 being scheduled to move to
node2.

Apr 22 19:47:26 node2 crmd[3255]:   notice: te_fence_node: Executing reboot fencing operation (13) on node1 (timeout=60000)
Apr 22 19:47:26 node2 stonith-ng[3251]:   notice: handle_request: Client crmd.3255.0a3d203a wants to fence (reboot) 'node1' with device '(any)'
Apr 22 19:47:26 node2 stonith-ng[3251]:   notice: initiate_remote_stonith_op: Initiating remote operation reboot for node1: 7660cf0e-10da-4e96-b3bd-4b415c72201c (0)

The actual killing of node1.

Apr 22 19:47:26 node2 pengine[3254]:  warning: process_pe_message: Calculated Transition 23: /var/lib/pacemaker/pengine/pe-warn-0.bz2
Apr 22 19:48:22 node2 corosync[3204]:   [pcmk  ] notice: pcmk_peer_update: Transitional membership event on ring 24: memb=1, new=0, lost=0
Apr 22 19:48:22 node2 corosync[3204]:   [pcmk  ] info: pcmk_peer_update: memb: node2 2085752330
Apr 22 19:48:22 node2 corosync[3204]:   [pcmk  ] notice: pcmk_peer_update: Stable membership event on ring 24: memb=2, new=1, lost=0
Apr 22 19:48:22 node2 corosync[3204]:   [pcmk  ] info: update_member: Node 2068975114/node1 is now: member

And this is where node1 comes back.

Apr 22 19:48:22 node2 corosync[3204]:   [pcmk  ] info: pcmk_peer_update: NEW:  node1 2068975114
Apr 22 19:48:22 node2 corosync[3204]:   [pcmk  ] info: pcmk_peer_update: MEMB: node1 2068975114
Apr 22 19:48:22 node2 corosync[3204]:   [pcmk  ] info: pcmk_peer_update: MEMB: node2 2085752330
Apr 22 19:48:22 node2 corosync[3204]:   [pcmk  ] info: send_member_notification: Sending membership update 24 to 2 children
Apr 22 19:48:22 node2 corosync[3204]:   [TOTEM ] A processor joined or left the membership and a new membership was formed.
Apr 22 19:48:22 node2 corosync[3204]:   [pcmk  ] info: update_member: 0x170c920 Node 2068975114 (node1) born on: 24
Apr 22 19:48:22 node2 corosync[3204]:   [pcmk  ] info: send_member_notification: Sending membership update 24 to 2 children
Apr 22 19:48:22 node2 cib[3250]:   notice: plugin_handle_membership: Membership 24: quorum acquired
Apr 22 19:48:22 node2 cib[3250]:   notice: crm_update_peer_state: plugin_handle_membership: Node node1[2068975114] - state is now member (was lost)

And the CIB sees it.

Apr 22 19:48:22 node2 crmd[3255]:   notice: plugin_handle_membership: Membership 24: quorum acquired
Apr 22 19:48:22 node2 crmd[3255]:   notice: crm_update_peer_state: plugin_handle_membership: Node node1[2068975114] - state is now member (was lost)
Apr 22 19:48:22 node2 corosync[3204]:   [CPG   ] chosen downlist: sender r(0) ip(10.14.82.124) r(1) ip(10.128.2.124) ; members(old:1 left:0)
Apr 22 19:48:22 node2 corosync[3204]:   [MAIN  ] Completed service synchronization, ready to provide service.
Apr 22 19:48:43 node2 corosync[3204]:   [TOTEM ] A processor failed, forming new configuration.

And it goes AWOL again?  It shouldn't have.  (Unless this is the
STONITH reboot finally taking effect -- the fence operation reports OK
on the very next line.)

Apr 22 19:48:44 node2 stonith-ng[3251]:   notice: log_operation: Operation 'reboot' [4110] (call 2 from crmd.3255) for host 'node1' with device 'st-fencing' returned: 0 (OK)
Apr 22 19:48:49 node2 corosync[3204]:   [pcmk  ] notice: pcmk_peer_update: Transitional membership event on ring 28: memb=1, new=0, lost=1
Apr 22 19:48:49 node2 corosync[3204]:   [pcmk  ] info: pcmk_peer_update: memb: node2 2085752330
Apr 22 19:48:49 node2 corosync[3204]:   [pcmk  ] info: pcmk_peer_update: lost: node1 2068975114
Apr 22 19:48:49 node2 corosync[3204]:   [pcmk  ] notice: pcmk_peer_update: Stable membership event on ring 28: memb=1, new=0, lost=0
Apr 22 19:48:49 node2 corosync[3204]:   [pcmk  ] info: pcmk_peer_update: MEMB: node2 2085752330
Apr 22 19:48:49 node2 corosync[3204]:   [pcmk  ] info: ais_mark_unseen_peer_dead: Node node1 was not seen in the previous transition
Apr 22 19:48:49 node2 corosync[3204]:   [pcmk  ] info: update_member: Node 2068975114/node1 is now: lost
Apr 22 19:48:49 node2 corosync[3204]:   [pcmk  ] info: send_member_notification: Sending membership update 28 to 2 children
Apr 22 19:48:49 node2 corosync[3204]:   [TOTEM ] A processor joined or left the membership and a new membership was formed.
Apr 22 19:48:49 node2 crmd[3255]:   notice: plugin_handle_membership: Membership 28: quorum lost
Apr 22 19:48:49 node2 crmd[3255]:   notice: crm_update_peer_state: plugin_handle_membership: Node node1[2068975114] - state is now lost (was member)
Apr 22 19:48:49 node2 cib[3250]:   notice: plugin_handle_membership: Membership 28: quorum lost
Apr 22 19:48:49 node2 cib[3250]:   notice: crm_update_peer_state: plugin_handle_membership: Node node1[2068975114] - state is now lost (was member)
Apr 22 19:49:26 node2 corosync[3204]:   [pcmk  ] notice: pcmk_peer_update: Transitional membership event on ring 32: memb=1, new=0, lost=0
Apr 22 19:49:26 node2 corosync[3204]:   [pcmk  ] info: pcmk_peer_update: memb: node2 2085752330
Apr 22 19:49:26 node2 corosync[3204]:   [pcmk  ] notice: pcmk_peer_update: Stable membership event on ring 32: memb=2, new=1, lost=0
Apr 22 19:49:26 node2 corosync[3204]:   [pcmk  ] info: update_member: Node 2068975114/node1 is now: member
Apr 22 19:49:26 node2 corosync[3204]:   [pcmk  ] info: pcmk_peer_update: NEW:  node1 2068975114
Apr 22 19:49:26 node2 corosync[3204]:   [pcmk  ] info: pcmk_peer_update: MEMB: node1 2068975114
Apr 22 19:49:26 node2 corosync[3204]:   [pcmk  ] info: pcmk_peer_update: MEMB: node2 2085752330
Apr 22 19:49:26 node2 corosync[3204]:   [pcmk  ] info: send_member_notification: Sending membership update 32 to 2 children
Apr 22 19:49:26 node2 corosync[3204]:   [TOTEM ] A processor joined or left the membership and a new membership was formed.
Apr 22 19:49:26 node2 corosync[3204]:   [pcmk  ] info: update_member: 0x170c920 Node 2068975114 (node1) born on: 32
Apr 22 19:49:26 node2 corosync[3204]:   [pcmk  ] info: send_member_notification: Sending membership update 32 to 2 children
Apr 22 19:49:26 node2 crmd[3255]:   notice: plugin_handle_membership: Membership 32: quorum acquired
Apr 22 19:49:26 node2 crmd[3255]:   notice: crm_update_peer_state: plugin_handle_membership: Node node1[2068975114] - state is now member (was lost)
Apr 22 19:49:26 node2 cib[3250]:   notice: plugin_handle_membership: Membership 32: quorum acquired
Apr 22 19:49:26 node2 cib[3250]:   notice: crm_update_peer_state: plugin_handle_membership: Node node1[2068975114] - state is now member (was lost)

And it's back again?  Note that the reboot is only confirmed now, at
19:49:26 -- two minutes after it was initiated, and after node1 has
already rejoined twice -- at which point transition 23, the failover,
is stopped with its remaining actions skipped.

Apr 22 19:49:26 node2 corosync[3204]:   [CPG   ] chosen downlist: sender r(0) ip(10.14.82.124) r(1) ip(10.128.2.124) ; members(old:2 left:1)
Apr 22 19:49:26 node2 corosync[3204]:   [MAIN  ] Completed service synchronization, ready to provide service.
Apr 22 19:49:26 node2 stonith-ng[3251]:   notice: remote_op_done: Operation reboot of node1 by node2 for crmd.3255 at node2.7660cf0e: OK
Apr 22 19:49:26 node2 crmd[3255]:   notice: tengine_stonith_callback: Stonith operation 2/13:23:0:e63ab1bb-b839-48d5-9a4d-b27677d3787f: OK (0)
Apr 22 19:49:26 node2 crmd[3255]:   notice: tengine_stonith_notify: Peer node1 was terminated (reboot) by node2 for node2: OK (ref=7660cf0e-10da-4e96-b3bd-4b415c72201c) by client crmd.3255
Apr 22 19:49:26 node2 crmd[3255]:   notice: run_graph: Transition 23 (Complete=1, Pending=0, Fired=0, Skipped=7, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-warn-0.bz2): Stopped
Apr 22 19:49:33 node2 attrd[3253]:   notice: attrd_local_callback: Sending full refresh (origin=crmd)
Apr 22 19:49:33 node2 attrd[3253]:   notice: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true)
Apr 22 19:49:34 node2 pengine[3254]:   notice: unpack_config: On loss of CCM Quorum: Ignore
Apr 22 19:49:34 node2 pengine[3254]:   notice: LogActions: Start   st-fencing#011(node1)
Apr 22 19:49:34 node2 pengine[3254]:   notice: LogActions: Start   resource1#011(node1)

In any case, here we can see that resource1 and the fencing resource
were started up on node1 again.

Why would these be started back up on a node that has clearly failed,
rather than on this node, node2?
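
The placement itself I can almost explain: with the location
constraints shown below, and assuming the default resource-stickiness
of 0 (these logs don't show it either way), resource1's allocation
scores once node1 is a member again would be something like:

  node1: 20  (resource1-primary)
  node2: 10  (resource1-secondary)

so the policy engine preferring node1 is expected.  What I can't
explain is why the failover to node2 never completed while node1 was
actually down.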

Apr 22 19:49:34 node2 crmd[3255]:   notice: te_rsc_command: Initiating action 5: monitor st-fencing_monitor_0 on node1
Apr 22 19:49:34 node2 crmd[3255]:   notice: te_rsc_command: Initiating action 6: monitor resource1_monitor_0 on node1
Apr 22 19:49:34 node2 crmd[3255]:   notice: te_rsc_command: Initiating action 7: monitor resource2_monitor_0 on node1
Apr 22 19:49:34 node2 pengine[3254]:   notice: process_pe_message: Calculated Transition 24: /var/lib/pacemaker/pengine/pe-input-95.bz2
Apr 22 19:49:40 node2 crmd[3255]:   notice: te_rsc_command: Initiating action 4: probe_complete probe_complete on node1 - no waiting
Apr 22 19:49:40 node2 crmd[3255]:   notice: te_rsc_command: Initiating action 9: start st-fencing_start_0 on node1
Apr 22 19:49:40 node2 crmd[3255]:   notice: te_rsc_command: Initiating action 10: start resource1_start_0 on node1
Apr 22 19:50:24 node2 crmd[3255]:   notice: te_rsc_command: Initiating action 11: monitor resource1_monitor_5000 on node1
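
To summarize the fencing timeline as I read it from the logs above
(the interpretations in parentheses are mine):

  19:47:26  reboot fencing of node1 initiated
  19:48:22  node1 rejoins the membership (ring 24)
  19:48:43  node1 lost again (presumably the STONITH reboot landing)
  19:48:44  stonith-ng: 'reboot' of node1 returned OK
  19:48:49  membership ring 28 formed without node1
  19:49:26  node1 rejoins (ring 32); only now does remote_op_done
            confirm the fence, and transition 23 (the failover) is
            stopped with 7 of its actions skipped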

And here are the cluster configuration messages, so that you can see
how resource1 and resource2 were added, what constraints were set on
them, etc.:

Apr 22 19:45:46 node2 cib[3250]:   notice: log_cib_diff: cib:diff: Local-only Change: 0.23.1
Apr 22 19:45:46 node2 cib[3250]:   notice: cib:diff: -- <cib admin_epoch="0" epoch="22" num_updates="1"/>
Apr 22 19:45:46 node2 cib[3250]:   notice: cib:diff: ++       <primitive class="ocf" provider="foo" type="Target" id="resource1">
Apr 22 19:45:46 node2 cib[3250]:   notice: cib:diff: ++         <meta_attributes id="resource1-meta_attributes">
Apr 22 19:45:46 node2 cib[3250]:   notice: cib:diff: ++           <nvpair name="target-role" id="resource1-meta_attributes-target-role" value="Stopped"/>
Apr 22 19:45:46 node2 cib[3250]:   notice: cib:diff: ++         </meta_attributes>
Apr 22 19:45:46 node2 cib[3250]:   notice: cib:diff: ++         <operations id="resource1-operations">
Apr 22 19:45:46 node2 cib[3250]:   notice: cib:diff: ++           <op id="resource1-monitor-5" interval="5" name="monitor" timeout="60"/>
Apr 22 19:45:46 node2 cib[3250]:   notice: cib:diff: ++           <op id="resource1-start-0" interval="0" name="start" timeout="300"/>
Apr 22 19:45:46 node2 cib[3250]:   notice: cib:diff: ++           <op id="resource1-stop-0" interval="0" name="stop" timeout="300"/>
Apr 22 19:45:46 node2 cib[3250]:   notice: cib:diff: ++         </operations>
Apr 22 19:45:46 node2 cib[3250]:   notice: cib:diff: ++         <instance_attributes id="resource1-instance_attributes">
Apr 22 19:45:46 node2 cib[3250]:   notice: cib:diff: ++           <nvpair id="resource1-instance_attributes-target" name="target" value="bddf8d7c-5b5c-4c24-9e55-4704dd336a8b"/>
Apr 22 19:45:46 node2 cib[3250]:   notice: cib:diff: ++         </instance_attributes>
Apr 22 19:45:46 node2 cib[3250]:   notice: cib:diff: ++       </primitive>
Apr 22 19:45:46 node2 pengine[3254]:   notice: unpack_config: On loss of CCM Quorum: Ignore
Apr 22 19:45:46 node2 crmd[3255]:   notice: te_rsc_command: Initiating action 6: monitor resource1_monitor_0 on node2 (local)
Apr 22 19:45:46 node2 pengine[3254]:   notice: process_pe_message: Calculated Transition 15: /var/lib/pacemaker/pengine/pe-input-87.bz2
Apr 22 19:45:46 node2 crmd[3255]:   notice: te_rsc_command: Initiating action 4: monitor resource1_monitor_0 on node1
Apr 22 19:45:46 node2 cib[3250]:   notice: log_cib_diff: cib:diff: Local-only Change: 0.24.1
Apr 22 19:45:46 node2 cib[3250]:   notice: cib:diff: -- <cib admin_epoch="0" epoch="23" num_updates="1"/>
Apr 22 19:45:46 node2 cib[3250]:   notice: cib:diff: ++       <rsc_location id="resource1-primary" node="node1" rsc="resource1" score="20"/>
Apr 22 19:45:46 node2 crmd[3255]:   notice: te_rsc_command: Initiating action 3: probe_complete probe_complete on node1 - no waiting
Apr 22 19:45:46 node2 crmd[3255]:   notice: process_lrm_event: LRM operation resource1_monitor_0 (call=10, rc=7, cib-update=58, confirmed=true) not running
Apr 22 19:45:46 node2 crmd[3255]:   notice: te_rsc_command: Initiating action 5: probe_complete probe_complete on node2 (local) - no waiting
Apr 22 19:45:46 node2 crmd[3255]:   notice: run_graph: Transition 15 (Complete=4, Pending=0, Fired=0, Skipped=1, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-87.bz2): Stopped
Apr 22 19:45:46 node2 pengine[3254]:   notice: unpack_config: On loss of CCM Quorum: Ignore
Apr 22 19:45:46 node2 crmd[3255]:   notice: run_graph: Transition 16 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-88.bz2): Complete
Apr 22 19:45:46 node2 crmd[3255]:   notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
Apr 22 19:45:46 node2 pengine[3254]:   notice: process_pe_message: Calculated Transition 16: /var/lib/pacemaker/pengine/pe-input-88.bz2
Apr 22 19:45:47 node2 stonith-ng[3251]:   notice: stonith_device_register: Added 'st-fencing' to the device list (1 active devices)
Apr 22 19:45:47 node2 stonith-ng[3251]:   notice: unpack_config: On loss of CCM Quorum: Ignore
Apr 22 19:45:47 node2 cibadmin[3520]:   notice: crm_log_args: Invoked: cibadmin -o constraints -C -X <rsc_location id="resource1-secondary" node="node2" rsc="resource1" score="10"/> 
Apr 22 19:45:47 node2 crmd[3255]:   notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
Apr 22 19:45:47 node2 cib[3250]:   notice: cib:diff: Diff: --- 0.24.3
Apr 22 19:45:47 node2 cib[3250]:   notice: cib:diff: Diff: +++ 0.25.1 1e5f079657dde6f6085a149c0cc5398b
Apr 22 19:45:47 node2 cib[3250]:   notice: cib:diff: -- <cib admin_epoch="0" epoch="24" num_updates="3"/>
Apr 22 19:45:47 node2 cib[3250]:   notice: cib:diff: ++       <rsc_location id="resource1-secondary" node="node2" rsc="resource1" score="10"/>
Apr 22 19:45:47 node2 pengine[3254]:   notice: unpack_config: On loss of CCM Quorum: Ignore
Apr 22 19:45:47 node2 crmd[3255]:   notice: run_graph: Transition 17 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-89.bz2): Complete
Apr 22 19:45:47 node2 crmd[3255]:   notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
Apr 22 19:45:47 node2 pengine[3254]:   notice: process_pe_message: Calculated Transition 17: /var/lib/pacemaker/pengine/pe-input-89.bz2
Apr 22 19:45:48 node2 stonith-ng[3251]:   notice: stonith_device_register: Added 'st-fencing' to the device list (1 active devices)
Apr 22 19:45:48 node2 stonith-ng[3251]:   notice: unpack_config: On loss of CCM Quorum: Ignore
Apr 22 19:45:48 node2 crmd[3255]:   notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
Apr 22 19:45:48 node2 cib[3250]:   notice: log_cib_diff: cib:diff: Local-only Change: 0.26.1
Apr 22 19:45:48 node2 cib[3250]:   notice: cib:diff: --           <nvpair value="Stopped" id="resource1-meta_attributes-target-role"/>
Apr 22 19:45:48 node2 cib[3250]:   notice: cib:diff: ++           <nvpair name="target-role" id="resource1-meta_attributes-target-role" value="Started"/>
Apr 22 19:45:48 node2 pengine[3254]:   notice: unpack_config: On loss of CCM Quorum: Ignore
Apr 22 19:45:48 node2 pengine[3254]:   notice: LogActions: Start   resource1#011(node1)
Apr 22 19:45:48 node2 crmd[3255]:   notice: te_rsc_command: Initiating action 7: start resource1_start_0 on node1
Apr 22 19:45:48 node2 pengine[3254]:   notice: process_pe_message: Calculated Transition 18: /var/lib/pacemaker/pengine/pe-input-90.bz2
Apr 22 19:45:49 node2 crmd[3255]:   notice: te_rsc_command: Initiating action 8: monitor resource1_monitor_5000 on node1
Apr 22 19:45:49 node2 stonith-ng[3251]:   notice: stonith_device_register: Added 'st-fencing' to the device list (1 active devices)
Apr 22 19:45:49 node2 stonith-ng[3251]:   notice: unpack_config: On loss of CCM Quorum: Ignore
Apr 22 19:45:49 node2 crmd[3255]:   notice: run_graph: Transition 18 (Complete=2, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-90.bz2): Complete
Apr 22 19:45:49 node2 crmd[3255]:   notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
Apr 22 19:45:50 node2 stonith-ng[3251]:   notice: stonith_device_register: Added 'st-fencing' to the device list (1 active devices)
Apr 22 19:45:53 node2 cibadmin[3597]:   notice: crm_log_args: Invoked: cibadmin -o resources -C -x /tmp/tmpoSwAoH 
Apr 22 19:45:53 node2 crmd[3255]:   notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
Apr 22 19:45:53 node2 stonith-ng[3251]:   notice: unpack_config: On loss of CCM Quorum: Ignore
Apr 22 19:45:53 node2 cib[3250]:   notice: cib:diff: Diff: --- 0.26.3
Apr 22 19:45:53 node2 cib[3250]:   notice: cib:diff: Diff: +++ 0.27.1 c0a07b74061c440082677bacfd9eff2a
Apr 22 19:45:53 node2 cib[3250]:   notice: cib:diff: -- <cib admin_epoch="0" epoch="26" num_updates="3"/>
Apr 22 19:45:53 node2 cib[3250]:   notice: cib:diff: ++       <primitive class="ocf" provider="foo" type="Target" id="resource2">
Apr 22 19:45:53 node2 cib[3250]:   notice: cib:diff: ++         <meta_attributes id="resource2-meta_attributes">
Apr 22 19:45:53 node2 cib[3250]:   notice: cib:diff: ++           <nvpair name="target-role" id="resource2-meta_attributes-target-role" value="Stopped"/>
Apr 22 19:45:53 node2 cib[3250]:   notice: cib:diff: ++         </meta_attributes>
Apr 22 19:45:53 node2 cib[3250]:   notice: cib:diff: ++         <operations id="resource2-operations">
Apr 22 19:45:53 node2 cib[3250]:   notice: cib:diff: ++           <op id="resource2-monitor-5" interval="5" name="monitor" timeout="60"/>
Apr 22 19:45:53 node2 cib[3250]:   notice: cib:diff: ++           <op id="resource2-start-0" interval="0" name="start" timeout="300"/>
Apr 22 19:45:53 node2 cib[3250]:   notice: cib:diff: ++           <op id="resource2-stop-0" interval="0" name="stop" timeout="300"/>
Apr 22 19:45:53 node2 cib[3250]:   notice: cib:diff: ++         </operations>
Apr 22 19:45:53 node2 cib[3250]:   notice: cib:diff: ++         <instance_attributes id="resource2-instance_attributes">
Apr 22 19:45:53 node2 cib[3250]:   notice: cib:diff: ++           <nvpair id="resource2-instance_attributes-target" name="target" value="38716497-b4a2-4ad7-8307-d9829857c7fc"/>
Apr 22 19:45:53 node2 cib[3250]:   notice: cib:diff: ++         </instance_attributes>
Apr 22 19:45:53 node2 cib[3250]:   notice: cib:diff: ++       </primitive>
Apr 22 19:45:53 node2 pengine[3254]:   notice: unpack_config: On loss of CCM Quorum: Ignore
Apr 22 19:45:53 node2 pengine[3254]:   notice: process_pe_message: Calculated Transition 19: /var/lib/pacemaker/pengine/pe-input-91.bz2
Apr 22 19:45:53 node2 crmd[3255]:   notice: te_rsc_command: Initiating action 7: monitor resource2_monitor_0 on node2 (local)
Apr 22 19:45:53 node2 crmd[3255]:   notice: te_rsc_command: Initiating action 5: monitor resource2_monitor_0 on node1
Apr 22 19:45:53 node2 cibadmin[3601]:   notice: crm_log_args: Invoked: cibadmin -o constraints -C -X <rsc_location id="resource2-primary" node="node2" rsc="resource2" score="20"/> 
Apr 22 19:45:53 node2 cib[3250]:   notice: log_cib_diff: cib:diff: Local-only Change: 0.28.1
Apr 22 19:45:53 node2 cib[3250]:   notice: cib:diff: -- <cib admin_epoch="0" epoch="27" num_updates="1"/>
Apr 22 19:45:53 node2 cib[3250]:   notice: cib:diff: ++       <rsc_location id="resource2-primary" node="node2" rsc="resource2" score="20"/>
Apr 22 19:45:54 node2 crmd[3255]:   notice: process_lrm_event: LRM operation resource2_monitor_0 (call=15, rc=7, cib-update=63, confirmed=true) not running
Apr 22 19:45:54 node2 crmd[3255]:   notice: te_rsc_command: Initiating action 6: probe_complete probe_complete on node2 (local) - no waiting
Apr 22 19:45:54 node2 crmd[3255]:   notice: te_rsc_command: Initiating action 4: probe_complete probe_complete on node1 - no waiting
Apr 22 19:45:54 node2 crmd[3255]:   notice: run_graph: Transition 19 (Complete=4, Pending=0, Fired=0, Skipped=1, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-91.bz2): Stopped
Apr 22 19:45:54 node2 pengine[3254]:   notice: unpack_config: On loss of CCM Quorum: Ignore
Apr 22 19:45:54 node2 pengine[3254]:   notice: process_pe_message: Calculated Transition 20: /var/lib/pacemaker/pengine/pe-input-92.bz2
Apr 22 19:45:54 node2 crmd[3255]:   notice: run_graph: Transition 20 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-92.bz2): Complete
Apr 22 19:45:54 node2 crmd[3255]:   notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
Apr 22 19:45:54 node2 crmd[3255]:   notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
Apr 22 19:45:54 node2 cib[3250]:   notice: cib:diff: Diff: --- 0.28.3
Apr 22 19:45:54 node2 cib[3250]:   notice: cib:diff: Diff: +++ 0.29.1 a5aa96d3ac5932d62dd542c74956adaa
Apr 22 19:45:54 node2 cib[3250]:   notice: cib:diff: -- <cib admin_epoch="0" epoch="28" num_updates="3"/>
Apr 22 19:45:54 node2 cib[3250]:   notice: cib:diff: ++       <rsc_location id="resource2-secondary" node="node1" rsc="resource2" score="10"/>
Apr 22 19:45:54 node2 pengine[3254]:   notice: unpack_config: On loss of CCM Quorum: Ignore
Apr 22 19:45:54 node2 pengine[3254]:   notice: process_pe_message: Calculated Transition 21: /var/lib/pacemaker/pengine/pe-input-93.bz2
Apr 22 19:45:54 node2 crmd[3255]:   notice: run_graph: Transition 21 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-93.bz2): Complete
Apr 22 19:45:54 node2 crmd[3255]:   notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
Apr 22 19:45:54 node2 stonith-ng[3251]:   notice: stonith_device_register: Added 'st-fencing' to the device list (1 active devices)
Apr 22 19:45:54 node2 stonith-ng[3251]:   notice: unpack_config: On loss of CCM Quorum: Ignore
Apr 22 19:45:55 node2 stonith-ng[3251]:   notice: stonith_device_register: Added 'st-fencing' to the device list (1 active devices)
Apr 22 19:45:55 node2 stonith-ng[3251]:   notice: unpack_config: On loss of CCM Quorum: Ignore
Apr 22 19:45:55 node2 crmd[3255]:   notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
Apr 22 19:45:55 node2 cib[3250]:   notice: log_cib_diff: cib:diff: Local-only Change: 0.30.1
Apr 22 19:45:55 node2 cib[3250]:   notice: cib:diff: --           <nvpair value="Stopped" id="resource2-meta_attributes-target-role"/>
Apr 22 19:45:55 node2 cib[3250]:   notice: cib:diff: ++           <nvpair name="target-role" id="resource2-meta_attributes-target-role" value="Started"/>
Apr 22 19:45:55 node2 pengine[3254]:   notice: unpack_config: On loss of CCM Quorum: Ignore
Apr 22 19:45:55 node2 pengine[3254]:   notice: LogActions: Start   resource2#011(node2)
Apr 22 19:45:55 node2 crmd[3255]:   notice: te_rsc_command: Initiating action 10: start resource2_start_0 on node2 (local)
Apr 22 19:45:55 node2 pengine[3254]:   notice: process_pe_message: Calculated Transition 22: /var/lib/pacemaker/pengine/pe-input-94.bz2
Apr 22 19:45:56 node2 stonith-ng[3251]:   notice: stonith_device_register: Added 'st-fencing' to the device list (1 active devices)
Apr 22 19:45:56 node2 stonith-ng[3251]:   notice: unpack_config: On loss of CCM Quorum: Ignore
Apr 22 19:45:56 node2 crmd[3255]:   notice: process_lrm_event: LRM operation resource2_start_0 (call=18, rc=0, cib-update=67, confirmed=true) ok
Apr 22 19:45:56 node2 crmd[3255]:   notice: te_rsc_command: Initiating action 11: monitor resource2_monitor_5000 on node2 (local)
Apr 22 19:45:57 node2 crmd[3255]:   notice: process_lrm_event: LRM operation resource2_monitor_5000 (call=21, rc=0, cib-update=68, confirmed=false) ok
Apr 22 19:45:57 node2 crmd[3255]:   notice: run_graph: Transition 22 (Complete=2, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-94.bz2): Complete
Apr 22 19:45:57 node2 crmd[3255]:   notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
Apr 22 19:45:57 node2 stonith-ng[3251]:   notice: stonith_device_register: Added 'st-fencing' to the device list (1 active devices)


Cheers,
b.
