[Pacemaker] unknown third node added to a 2 node cluster?
Brian J. Murrell (brian)
brian at interlinx.bc.ca
Tue Oct 7 15:09:01 UTC 2014
On a two-node pacemaker-1.1.10-14.el6_5.3 cluster with nodes "node5"
and "node6", I saw an "unknown" third node being added to the cluster,
but only on node5:
Sep 18 22:52:16 node5 corosync[17321]: [pcmk ] notice: pcmk_peer_update: Transitional membership event on ring 12: memb=2, new=0, lost=0
Sep 18 22:52:16 node5 corosync[17321]: [pcmk ] info: pcmk_peer_update: memb: node6 3713011210
Sep 18 22:52:16 node5 corosync[17321]: [pcmk ] info: pcmk_peer_update: memb: node5 3729788426
Sep 18 22:52:16 node5 corosync[17321]: [pcmk ] notice: pcmk_peer_update: Stable membership event on ring 12: memb=3, new=1, lost=0
Sep 18 22:52:16 node5 corosync[17321]: [pcmk ] info: update_member: Creating entry for node 2085752330 born on 12
Sep 18 22:52:16 node5 corosync[17321]: [pcmk ] info: update_member: Node 2085752330/unknown is now: member
Sep 18 22:52:16 node5 corosync[17321]: [pcmk ] info: pcmk_peer_update: NEW: .pending. 2085752330
Sep 18 22:52:16 node5 corosync[17321]: [pcmk ] info: pcmk_peer_update: MEMB: node6 3713011210
Sep 18 22:52:16 node5 corosync[17321]: [pcmk ] info: pcmk_peer_update: MEMB: node5 3729788426
Sep 18 22:52:16 node5 corosync[17321]: [pcmk ] info: pcmk_peer_update: MEMB: .pending. 2085752330
The "Stable membership event" above (memb=3, new=1) is where this third node, 2085752330, seems to appear.
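For what it's worth, the node IDs in these messages can be turned back into addresses. This is only a sketch under two assumptions that the logs do not confirm: that no explicit nodeid is set in corosync.conf, so corosync derived each ID from the node's ring 0 IPv4 address, and that the ID holds the address in little-endian byte order, as it would on x86. The function name nodeid_to_ipv4 is made up for illustration.

```python
import socket
import struct


def nodeid_to_ipv4(nodeid: int, little_endian: bool = True) -> str:
    """Decode a 32-bit corosync node ID into a dotted-quad IPv4 address.

    ASSUMPTION: the ID was auto-generated from the ring 0 address and is
    stored little-endian (x86). Pass little_endian=False to try the other
    byte order if the result looks wrong.
    """
    fmt = "<I" if little_endian else ">I"
    # Pack the integer into 4 bytes in the chosen order, then format them.
    return socket.inet_ntoa(struct.pack(fmt, nodeid))


if __name__ == "__main__":
    # Node IDs copied from the pcmk_peer_update lines above.
    for name, nodeid in [
        ("node6", 3713011210),
        ("node5", 3729788426),
        ("unknown", 2085752330),
    ]:
        print(f"{name:8s} {nodeid:>10d} -> {nodeid_to_ipv4(nodeid)}")
```

Under those assumptions node6 and node5 decode to 10.14.80.221 and 10.14.80.222, and the unknown ID decodes to 10.14.82.124, which would point at another host on a neighbouring subnet rather than at either cluster member.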
Sep 18 22:52:16 node5 corosync[17321]: [pcmk ] info: send_member_notification: Sending membership update 12 to 2 children
Sep 18 22:52:16 node5 corosync[17321]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
Sep 18 22:52:16 node5 cib[17371]: notice: crm_update_peer_state: plugin_handle_membership: Node (null)[2085752330] - state is now member (was (null))
Sep 18 22:52:16 node5 crmd[17376]: notice: crm_update_peer_state: plugin_handle_membership: Node (null)[2085752330] - state is now member (was (null))
Sep 18 22:52:16 node5 crmd[17376]: notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
Sep 18 22:52:16 node5 crmd[17376]: error: join_make_offer: No recipient for welcome message
Sep 18 22:52:16 node5 crmd[17376]: warning: do_state_transition: Only 2 of 3 cluster nodes are eligible to run resources - continue 0
Sep 18 22:52:16 node5 attrd[17374]: notice: attrd_local_callback: Sending full refresh (origin=crmd)
Sep 18 22:52:16 node5 attrd[17374]: notice: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true)
Sep 18 22:52:16 node5 stonith-ng[17372]: notice: unpack_config: On loss of CCM Quorum: Ignore
Sep 18 22:52:16 node5 cib[17371]: notice: cib:diff: Diff: --- 0.31.2
Sep 18 22:52:16 node5 cib[17371]: notice: cib:diff: Diff: +++ 0.32.1 4a679012144955c802557a39707247a2
Sep 18 22:52:16 node5 cib[17371]: notice: cib:diff: -- <nvpair value="Stopped" id="res1-meta_attributes-target-role"/>
Sep 18 22:52:16 node5 cib[17371]: notice: cib:diff: ++ <nvpair name="target-role" id="res1-meta_attributes-target-role" value="Started"/>
Sep 18 22:52:16 node5 pengine[17375]: notice: unpack_config: On loss of CCM Quorum: Ignore
Sep 18 22:52:16 node5 pengine[17375]: notice: LogActions: Start res1#011(node5)
Sep 18 22:52:16 node5 crmd[17376]: notice: te_rsc_command: Initiating action 7: start res1_start_0 on node5 (local)
Sep 18 22:52:16 node5 pengine[17375]: notice: process_pe_message: Calculated Transition 22: /var/lib/pacemaker/pengine/pe-input-165.bz2
Sep 18 22:52:16 node5 stonith-ng[17372]: notice: stonith_device_register: Device 'st-fencing' already existed in device list (1 active devices)
On node6, the log showed the following at the same time:
Sep 18 22:52:15 node6 corosync[11178]: [TOTEM ] Incrementing problem counter for seqid 5 iface 10.128.0.221 to [1 of 10]
Sep 18 22:52:16 node6 corosync[11178]: [TOTEM ] Incrementing problem counter for seqid 8 iface 10.128.0.221 to [2 of 10]
Sep 18 22:52:17 node6 corosync[11178]: [TOTEM ] Decrementing problem counter for iface 10.128.0.221 to [1 of 10]
Sep 18 22:52:19 node6 corosync[11178]: [TOTEM ] ring 1 active with no faults
Any idea what's going on here?
Cheers,
b.