[ClusterLabs] data loss of network would cause Pacemaker exit abnormally
Ken Gaillot
kgaillot at redhat.com
Wed Aug 31 15:39:52 UTC 2016
On 08/30/2016 01:58 PM, chenhj wrote:
> Hi,
>
> This is a continuation of the email below (I did not subscribe to this mailing list):
>
> http://clusterlabs.org/pipermail/users/2016-August/003838.html
>
>>From the above, I suspect that the node with the network loss was the
>>DC, and from its point of view, it was the other node that went away.
>
> Yes. The node with the network loss was the DC (node2).
>
> Could someone explain what the following messages mean, and
> why the pacemakerd process exits instead of rejoining the CPG group?
>
>> Aug 27 12:33:59 [46849] node3 pacemakerd: error: pcmk_cpg_membership:
>> We're not part of CPG group 'pacemakerd' anymore!
This means the node was kicked out of the membership. I don't remember
what that implies; I'm guessing the node exits because the cluster will
most likely fence it after kicking it out.
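
For illustration, here is a minimal, self-contained sketch (not Pacemaker's
actual code) of how a corosync CPG client can detect this condition in its
configuration-change callback: the local node ID shows up in the callback's
"left" list. The exit-and-wait-to-be-fenced reaction is my assumption about
a reasonable daemon response, not something taken from the Pacemaker source:

    /* Sketch: react to the local node being removed from a CPG group.
     * Uses corosync's libcpg; error handling kept minimal. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <corosync/cpg.h>

    static void confchg_cb(cpg_handle_t handle,
                           const struct cpg_name *group,
                           const struct cpg_address *members, size_t n_members,
                           const struct cpg_address *left, size_t n_left,
                           const struct cpg_address *joined, size_t n_joined)
    {
        unsigned int local_nodeid = 0;

        (void) members; (void) n_members; (void) joined; (void) n_joined;

        if (cpg_local_get(handle, &local_nodeid) != CS_OK) {
            return;
        }
        for (size_t i = 0; i < n_left; i++) {
            if (left[i].nodeid == local_nodeid) {
                /* This is the situation behind the quoted log message:
                 * the group's membership no longer includes this node. */
                fprintf(stderr, "We're not part of CPG group '%.*s' anymore!\n",
                        (int) group->length, group->value);
                /* Assumed reaction: give up and let the cluster fence or
                 * restart this node rather than run on with stale state. */
                exit(EXIT_FAILURE);
            }
        }
    }

The callback would be registered via cpg_initialize()/cpg_model_initialize()
and cpg_join() on the 'pacemakerd' group; that wiring is omitted here.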
>
>>> [root@node3 ~]# rpm -q corosync
>>> corosync-1.4.1-7.el6.x86_64
>>That is quite old ...
>>> [root@node3 ~]# cat /etc/redhat-release
>>> CentOS release 6.3 (Final)
>>> [root@node3 ~]# pacemakerd -F
>> Pacemaker 1.1.14-1.el6 (Build: 70404b0)
>>and I doubt that many people have tested Pacemaker 1.1.14 against
>>corosync 1.4.1 ... quite far away from
>>each other release-wise ...
>
> pacemaker 1.1.14 + corosync-1.4.7 can also reproduce this problem, but
> it seems to happen with lower probability.
The corosync 2 series is a major improvement, but some config changes
are necessary.
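
For reference, a corosync 2.x configuration for a small cluster like the one
in this thread could look roughly like the sketch below. The cluster name,
the udpu transport, and the node IDs are assumptions; the main differences
from a 1.4 setup are the explicit nodelist, the votequorum provider, and the
fact that the old Pacemaker service/plugin stanza is no longer used:

    totem {
        version: 2
        # assumed name; pick your own
        cluster_name: mycluster
        # unicast UDP; multicast is also possible
        transport: udpu
    }

    nodelist {
        node {
            ring0_addr: node2
            nodeid: 2
        }
        node {
            ring0_addr: node3
            nodeid: 3
        }
    }

    quorum {
        provider: corosync_votequorum
        # only if the cluster really has exactly two nodes
        two_node: 1
    }

    logging {
        to_syslog: yes
    }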