[ClusterLabs] Corosync ring shown faulty between healthy nodes & networks (rrp_mode: passive)
Martin Schlegel
martin at nuboreto.org
Tue Oct 4 22:09:21 UTC 2016
Hello all,
I am trying to understand the results of the following 2 Corosync heartbeat
ring failure scenarios I have been testing, and I hope somebody can explain
why they make sense.
Consider the following cluster:
* 3x Nodes: A, B and C
* 2x NICs for each Node
* Corosync 2.3.5 configured with "rrp_mode: passive" and udpu transport,
  with ring ids 0 and 1 on each node (see the configuration sketch below).
* On each node "corosync-cfgtool -s" shows:
[...] ring 0 active with no faults
[...] ring 1 active with no faults
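For reference, the relevant parts of corosync.conf look roughly like the
following minimal sketch (the networks, addresses and node ids are
placeholders, not our real values):

    totem {
        version: 2
        transport: udpu
        rrp_mode: passive
        interface {
            # ring 0 network (placeholder)
            ringnumber: 0
            bindnetaddr: 10.0.0.0
            mcastport: 5405
        }
        interface {
            # ring 1 network (placeholder)
            ringnumber: 1
            bindnetaddr: 10.0.1.0
            mcastport: 5405
        }
    }

    nodelist {
        node {
            nodeid: 1
            # node A addresses on both rings (placeholders)
            ring0_addr: 10.0.0.1
            ring1_addr: 10.0.1.1
        }
        # nodes B and C are defined the same way
    }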
Consider the following scenarios:
1. On node A only, block all communication on the NIC configured with
   ring id 0
2. On node A only, block all communication on both NICs, i.e. ring ids
   0 and 1
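"Block" here means dropping all traffic on the NIC with firewall rules,
roughly as follows (eth0/eth1 stand in for the real interface names):

    # scenario 1: isolate only the ring 0 NIC on node A
    iptables -A INPUT  -i eth0 -j DROP
    iptables -A OUTPUT -o eth0 -j DROP

    # scenario 2: additionally isolate the ring 1 NIC
    iptables -A INPUT  -i eth1 -j DROP
    iptables -A OUTPUT -o eth1 -j DROP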
The result of the above scenarios is as follows:
1. Nodes A, B and C (!) display the following ring status:
[...] Marking ringid 0 interface <IP-Address> FAULTY
[...] ring 1 active with no faults
2. Node A is shown as OFFLINE, while B and C display the following ring status:
[...] ring 0 active with no faults
[...] ring 1 active with no faults
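For completeness, the status lines above can be collected from all three
nodes in one go, e.g. like this (nodeA/nodeB/nodeC are placeholder
hostnames, and passwordless SSH between the nodes is assumed):

    # compare the ring state as seen from each node
    for n in nodeA nodeB nodeC; do
        echo "== $n =="
        ssh "$n" corosync-cfgtool -s
    done

    # membership view - node A drops out of this list in scenario 2
    corosync-quorumtool -l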
Questions:
1. Is this the expected outcome?
2. In experiment 1, B and C can still communicate with each other over both
   NICs, so why do B and C not display a "no faults" status for ring ids 0
   and 1, just like in experiment 2 when node A is completely unreachable?
Regards,
Martin Schlegel