[Pacemaker] crm_mon on Node-2 shows both Node-1 & Node-2 as online but crm_mon on Node-1 shows Node-2 as offline
Dan Frincu
df.cluster at gmail.com
Thu Apr 19 13:51:42 UTC 2012
Hi,
On Thu, Apr 19, 2012 at 3:56 PM, Parshvi <parshvi.17 at gmail.com> wrote:
> 1) What is the use of ssh without pass key between cluster nodes in pacemaker ?
> a. Use case:
> i. Two nodes in a cluster (Call them Node-1 and Node-2)
> ii. One interface configured in corosync.conf for its heartbeat or
> messaging. Eg. Bind net addr : 192.168.10.0
> iii. Another interface configured in /etc/hosts for hostname resolution.
> Eg. IP: 192.168.129.10 Hostname: Node-1
> Eg. IP: 192.168.129.11 Hostname: Node-2
> iv. Hence for all ssh communication between the two nodes, hostname resolves
> to subnet 129 address.
> v. 12 services configured in active/passive mode
> vi. 1 service configured in master/slave mode
> vii. 8 services are non-sticky (they failback) in active/passive
> viii. 4 services are sticky (do not failback) in active/passive
> ix. Distribution: Node-1 is primary for 8 services (of which 4 are non-
> sticky), Node-2 is preferred for 4 services of a total 12 (non-sticky)
>
> b. Observations:
> i. On Node-2, the interface was down over which IP: 192.168.129.11 Hostname:
> Node-2 was configured.
> ii. On Node-1 all interfaces were up.
> iii. Interface used by corosync for heartbeat/messaging was up at all times
> (Bind net addr : 192.168.10.0)
> iv. In crm_mon: Node-1 sees Node-2 as offline
> cibadmin --query fails to work (remote node did not respond)
> v. In crm_mon: Node-2 sees Node-1 as online
> vi. All the services were seen active on Node-1 (including those that were
> preferred for Node-2). Observed in crm_mon output.
> vii. 4 services for which Node-2 was preferred were also seen active on Node-2
> (hence 4 services active on both the nodes).
> Observed in crm_mon output: Only 4 services were shown active, the status of
> the rest of the services active on Node-1 did not reflect in crm_mon
> Even though crm_mon on Node-2 sees Node-1 as “online”.
> c. Errors in log file:
> i. On Node-2:
> 1. Resource ocf::RscRA:rsc appears to be active on 2 nodes
> 2. The above error appears for all the resources configured in pacemaker.
>
>
> Query:
> 1) For what purpose does Pacemaker require “ssh without a pass key” to be
> enabled between the nodes in a cluster ?
scp
> 2) For what purpose does Pacemaker use the Node “hostname”? How does the Node
> “hostname” come into the picture ?
When choosing where to allocate resources not explicitly tied to a node. See
http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Explained/#node-score-equal
and
http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Explained/#_background
> 3) Let’s say in a two node cluster two communication paths are available between
> the two nodes.
> a. Eth1 and eth2.
> b. The hostname of the node resolves to IP Address on eth1.
> c. Consider, eth1 (network cable disconnected) goes down.
> d. Eth2 is up, but hostname does not resolve to the IP on eth2 (resolves to
> eth1 addr).
Inter-node communication is usually specified by IP address, and
redundant connections (as in your case) are recommended.
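As a rough sketch of what redundant rings could look like in
corosync.conf (corosync 1.x syntax): this assumes a /24 netmask on both
networks, and the multicast addresses and ports are placeholders, not
values taken from your setup. Only the two network addresses come from
your description.

```
totem {
    version: 2
    # Use the second ring as a fallback if the first one fails.
    rrp_mode: passive

    interface {
        # Ring 0: the network you already use for heartbeat/messaging.
        ringnumber: 0
        bindnetaddr: 192.168.10.0
        mcastaddr: 226.94.1.1    # placeholder
        mcastport: 5405          # placeholder
    }

    interface {
        # Ring 1: the network the hostnames resolve to (assumed /24).
        ringnumber: 1
        bindnetaddr: 192.168.129.0
        mcastaddr: 226.94.1.2    # placeholder
        mcastport: 5407          # placeholder, at least 2 apart from ring 0
    }
}
```

With a layout along these lines the rings are selected by network
address, so cluster membership traffic should not depend on what the
hostname resolves to. Running corosync-cfgtool -s on each node shows the
status of each ring.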
> e. Will this (hostname) have any issue ?
--
Dan Frincu
CCNA, RHCE