[Pacemaker] crm_mon on Node-2 shows both Node-1 & Node-2 as online but crm_mon on Node-1 shows Node-2 as offline
Parshvi
parshvi.17 at gmail.com
Thu Apr 19 12:56:11 UTC 2012
1) What is the purpose of passwordless ssh (ssh without a pass key) between cluster nodes in Pacemaker?
a. Use case:
i. Two nodes in a cluster (Call them Node-1 and Node-2)
ii. One interface configured in corosync.conf for its heartbeat or
messaging. Eg. Bind net addr : 192.168.10.0
iii. Another interface configured in /etc/hosts for hostname resolution.
Eg. IP: 192.168.129.10 Hostname: Node-1
Eg. IP: 192.168.129.11 Hostname: Node-2
iv. Hence, for all ssh communication between the two nodes, the hostname resolves
to an address on the 192.168.129.x subnet.
v. 12 services configured in active/passive mode
vi. 1 service configured in master/slave mode
vii. 8 services are non-sticky (they failback) in active/passive
viii. 4 services are sticky (do not failback) in active/passive
ix. Distribution: Node-1 is preferred for 8 of the 12 services (4 of which are
non-sticky); Node-2 is preferred for the remaining 4 (all non-sticky).
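To make the address layout in this use case concrete, here is a minimal Python sketch. It is not part of the cluster configuration, only an illustration. It checks whether the addresses that the hostnames resolve to lie on the corosync bind network. The /24 netmask and the helper names `on_corosync_net` and `hostnames_off_ring` are assumptions made for the example; only the bind network address and the /etc/hosts entries come from the setup above.

```python
import ipaddress

# Assumption: the corosync ring uses a /24 netmask; the setup only states
# the bind network address 192.168.10.0.
COROSYNC_NET = ipaddress.ip_network("192.168.10.0/24")

# Hostname entries as listed in /etc/hosts in the use case.
HOSTS = {
    "Node-1": "192.168.129.10",
    "Node-2": "192.168.129.11",
}


def on_corosync_net(address, network=COROSYNC_NET):
    """Return True if the address belongs to the corosync bind network."""
    return ipaddress.ip_address(address) in network


def hostnames_off_ring(hosts=HOSTS, network=COROSYNC_NET):
    """List hostnames whose address is outside the corosync bind network."""
    return sorted(
        name for name, addr in hosts.items()
        if not on_corosync_net(addr, network)
    )


if __name__ == "__main__":
    # Both hostnames resolve to the 192.168.129.x subnet, so both are listed.
    print(hostnames_off_ring())
```

Under these assumptions both hostnames fall outside the corosync network, which matches point iv: anything addressed by hostname (such as ssh) travels over a different interface than the cluster heartbeat.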
b. Observations:
i. On Node-2, the interface on which IP 192.168.129.11 (Hostname: Node-2) was
configured was down.
ii. On Node-1 all interfaces were up.
iii. The interface used by corosync for heartbeat/messaging was up at all times
(Bind net addr : 192.168.10.0)
iv. In crm_mon: Node-1 sees Node-2 as offline
cibadmin --query fails to work (remote node did not respond)
v. In crm_mon: Node-2 sees Node-1 as online
vi. All the services were seen active on Node-1 (including those that were
preferred for Node-2). Observed in crm_mon output.
vii. The 4 services for which Node-2 was preferred were also seen active on Node-2
(hence 4 services were active on both nodes).
Observed in crm_mon output: only these 4 services were shown as active; the status
of the rest of the services, active on Node-1, was not reflected in crm_mon,
even though crm_mon on Node-2 sees Node-1 as “online”.
c. Errors in log file:
i. On Node-2:
1. Resource ocf::RscRA:rsc appears to be active on 2 nodes
2. The above error appears for all the resources configured in pacemaker.
Query:
1) For what purpose does Pacemaker require “ssh without a pass key” to be
enabled between the nodes in a cluster ?
2) For what purpose does Pacemaker use the node “hostname”? How does the node
“hostname” come into the picture?
3) Let’s say in a two node cluster two communication paths are available between
the two nodes.
a. Eth1 and eth2.
b. The hostname of the node resolves to IP Address on eth1.
c. Consider that eth1 goes down (network cable disconnected).
d. Eth2 is up, but hostname does not resolve to the IP on eth2 (resolves to
eth1 addr).
e. Will this cause any issue, given that the hostname still resolves to the eth1 address?