[Pacemaker] unknown third node added to a 2 node cluster?

Andrew Beekhof andrew at beekhof.net
Tue Oct 7 21:39:55 EDT 2014


On 8 Oct 2014, at 2:09 am, Brian J. Murrell (brian) <brian at interlinx.bc.ca> wrote:

> Given a 2 node pacemaker-1.1.10-14.el6_5.3 cluster with nodes "node5"
> and "node6" I saw an "unknown" third node being added to the cluster,
> but only on node5:

Is either node using DHCP?
I would guess node6 got a new IP address (or that corosync decided to bind to a different one).
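
One way to test that guess is to decode the node IDs from the log. The sketch below assumes no explicit nodeid is configured, so that corosync derived each ID from the node's ring0 IPv4 address, and that the hosts are little-endian (x86). The helper name nodeid_to_ipv4 is made up for this illustration and is not part of corosync or pacemaker.

```python
import socket
import struct


def nodeid_to_ipv4(nodeid: int) -> str:
    """Interpret an auto-generated corosync node ID as a dotted-quad address.

    Assumption: the ID is the 32-bit ring0 IPv4 address read as an integer
    on a little-endian host, so the lowest byte is the first octet.
    """
    return socket.inet_ntoa(struct.pack("<I", nodeid))


# Node IDs copied from the pcmk_peer_update lines in the quoted log.
NODE_IDS = {
    "node6": 3713011210,
    "node5": 3729788426,
    "unknown": 2085752330,
}

for name, nodeid in NODE_IDS.items():
    print(f"{name:8} {nodeid:>10} -> {nodeid_to_ipv4(nodeid)}")
```

Under those assumptions the two known nodes decode to 10.14.80.221 and 10.14.80.222, while the unknown ID decodes to 10.14.82.124. That is an address outside 10.14.80.x, which would be consistent with some interface coming up with, or corosync binding to, a different address.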

> 
> Sep 18 22:52:16 node5 corosync[17321]:   [pcmk  ] notice: pcmk_peer_update: Transitional membership event on ring 12: memb=2, new=0, lost=0
> Sep 18 22:52:16 node5 corosync[17321]:   [pcmk  ] info: pcmk_peer_update: memb: node6 3713011210
> Sep 18 22:52:16 node5 corosync[17321]:   [pcmk  ] info: pcmk_peer_update: memb: node5 3729788426
> Sep 18 22:52:16 node5 corosync[17321]:   [pcmk  ] notice: pcmk_peer_update: Stable membership event on ring 12: memb=3, new=1, lost=0
> Sep 18 22:52:16 node5 corosync[17321]:   [pcmk  ] info: update_member: Creating entry for node 2085752330 born on 12
> Sep 18 22:52:16 node5 corosync[17321]:   [pcmk  ] info: update_member: Node 2085752330/unknown is now: member
> Sep 18 22:52:16 node5 corosync[17321]:   [pcmk  ] info: pcmk_peer_update: NEW:  .pending. 2085752330
> Sep 18 22:52:16 node5 corosync[17321]:   [pcmk  ] info: pcmk_peer_update: MEMB: node6 3713011210
> Sep 18 22:52:16 node5 corosync[17321]:   [pcmk  ] info: pcmk_peer_update: MEMB: node5 3729788426
> Sep 18 22:52:16 node5 corosync[17321]:   [pcmk  ] info: pcmk_peer_update: MEMB: .pending. 2085752330
> 
> Above is where this third node seems to appear.
> 
> Sep 18 22:52:16 node5 corosync[17321]:   [pcmk  ] info: send_member_notification: Sending membership update 12 to 2 children
> Sep 18 22:52:16 node5 corosync[17321]:   [TOTEM ] A processor joined or left the membership and a new membership was formed.
> Sep 18 22:52:16 node5 cib[17371]:   notice: crm_update_peer_state: plugin_handle_membership: Node (null)[2085752330] - state is now member (was (null))
> Sep 18 22:52:16 node5 crmd[17376]:   notice: crm_update_peer_state: plugin_handle_membership: Node (null)[2085752330] - state is now member (was (null))
> Sep 18 22:52:16 node5 crmd[17376]:   notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
> Sep 18 22:52:16 node5 crmd[17376]:    error: join_make_offer: No recipient for welcome message
> Sep 18 22:52:16 node5 crmd[17376]:  warning: do_state_transition: Only 2 of 3 cluster nodes are eligible to run resources - continue 0
> Sep 18 22:52:16 node5 attrd[17374]:   notice: attrd_local_callback: Sending full refresh (origin=crmd)
> Sep 18 22:52:16 node5 attrd[17374]:   notice: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true)
> Sep 18 22:52:16 node5 stonith-ng[17372]:   notice: unpack_config: On loss of CCM Quorum: Ignore
> Sep 18 22:52:16 node5 cib[17371]:   notice: cib:diff: Diff: --- 0.31.2
> Sep 18 22:52:16 node5 cib[17371]:   notice: cib:diff: Diff: +++ 0.32.1 4a679012144955c802557a39707247a2
> Sep 18 22:52:16 node5 cib[17371]:   notice: cib:diff: --           <nvpair value="Stopped" id="res1-meta_attributes-target-role"/>
> Sep 18 22:52:16 node5 cib[17371]:   notice: cib:diff: ++           <nvpair name="target-role" id="res1-meta_attributes-target-role" value="Started"/>
> Sep 18 22:52:16 node5 pengine[17375]:   notice: unpack_config: On loss of CCM Quorum: Ignore
> Sep 18 22:52:16 node5 pengine[17375]:   notice: LogActions: Start   res1#011(node5)
> Sep 18 22:52:16 node5 crmd[17376]:   notice: te_rsc_command: Initiating action 7: start res1_start_0 on node5 (local)
> Sep 18 22:52:16 node5 pengine[17375]:   notice: process_pe_message: Calculated Transition 22: /var/lib/pacemaker/pengine/pe-input-165.bz2
> Sep 18 22:52:16 node5 stonith-ng[17372]:   notice: stonith_device_register: Device 'st-fencing' already existed in device list (1 active devices)
> 
> On node6 at the same time the following was in the log:
> 
> Sep 18 22:52:15 node6 corosync[11178]:   [TOTEM ] Incrementing problem counter for seqid 5 iface 10.128.0.221 to [1 of 10]
> Sep 18 22:52:16 node6 corosync[11178]:   [TOTEM ] Incrementing problem counter for seqid 8 iface 10.128.0.221 to [2 of 10]
> Sep 18 22:52:17 node6 corosync[11178]:   [TOTEM ] Decrementing problem counter for iface 10.128.0.221 to [1 of 10]
> Sep 18 22:52:19 node6 corosync[11178]:   [TOTEM ] ring 1 active with no faults
> 
> Any idea what's going on here?
> 
> Cheers,
> b.
