[Pacemaker] crm_mon on Node-2 shows both Node-1 & Node-2 as online but crm_mon on Node-1 shows Node-2 as offline
Dan Frincu
df.cluster at gmail.com
Fri Apr 20 07:31:28 UTC 2012
On Fri, Apr 20, 2012 at 3:09 AM, Andrew Beekhof <andrew at beekhof.net> wrote:
> On Thu, Apr 19, 2012 at 11:51 PM, Dan Frincu <df.cluster at gmail.com> wrote:
>> Hi,
>>
>> On Thu, Apr 19, 2012 at 3:56 PM, Parshvi <parshvi.17 at gmail.com> wrote:
>>> 1) What is the use of passwordless ssh (ssh without a pass key) between cluster nodes in Pacemaker?
>>> a. Use case:
>>> i. Two nodes in a cluster (Call them Node-1 and Node-2)
>>> ii. One interface configured in corosync.conf for its heartbeat or
>>> messaging. Eg. Bind net addr : 192.168.10.0
>>> iii. Another interface configured in /etc/hosts for hostname resolution.
>>> Eg. IP: 192.168.129.10 Hostname: Node-1
>>> Eg. IP: 192.168.129.11 Hostname: Node-2
>>> iv. Hence for all ssh communication between the two nodes, hostname resolves
>>> to subnet 129 address.
>>> v. 12 services configured in active/passive mode
>>> vi. 1 service configured in master/slave mode
>>> vii. 8 services are non-sticky (they failback) in active/passive
>>> viii. 4 services are sticky (do not failback) in active/passive
>>> ix. Distribution: Node-1 is primary for 8 services (of which 4 are
>>> non-sticky); Node-2 is preferred for the remaining 4 of the 12 services (non-sticky)
>>>
>>> b. Observations:
>>> i. On Node-2, the interface configured with IP 192.168.129.11 (hostname
>>> Node-2) was down.
>>> ii. On Node-1 all interfaces were up.
>>> iii. Interface used by corosync for heartbeat/messaging was up at all times
>>> (Bind net addr : 192.168.10.0)
>>> iv. In crm_mon: Node-1 sees Node-2 as offline
>>> cibadmin --query fails to work (remote node did not respond)
>>> v. In crm_mon: Node-2 sees Node-1 as online
>>> vi. All the services were seen active on Node-1 (including those that were
>>> preferred for Node-2). Observed in crm_mon output.
>>> vii. The 4 services for which Node-2 was preferred were also seen active on
>>> Node-2 (hence 4 services active on both nodes).
>>> Observed in crm_mon output: only those 4 services were shown as active; the
>>> status of the other services active on Node-1 was not reflected in crm_mon,
>>> even though crm_mon on Node-2 sees Node-1 as “online”.
>>> c. Errors in log file:
>>> i. On Node-2:
>>> 1. Resource ocf::RscRA:rsc appears to be active on 2 nodes
>>> 2. The above error appears for all the resources configured in pacemaker.
>>>
>>>
>>> Query:
>>> 1) For what purpose does Pacemaker require “ssh without a pass key” to be
>>> enabled between the nodes in a cluster ?
>>
>> scp
>
> But pacemaker doesn't use scp... or is this in relation to the
> clusters from scratch document?
It's in relation to the Clusters from Scratch document.
> -ECONFUSED
Sorry about that ;)
>
>>
>>> 2) For what purpose does Pacemaker use the node “hostname”? How does the node
>>> “hostname” come into the picture?
>>
>> When choosing where to allocate resources not explicitly tied to a node. See
>>
>> http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Explained/#node-score-equal
>>
>> and
>>
>> http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Explained/#_background
>>
>>> 3) Let’s say in a two node cluster two communication paths are available between
>>> the two nodes.
>>> a. Eth1 and eth2.
>>> b. The hostname of the node resolves to IP Address on eth1.
>>> c. Consider, eth1 (network cable disconnected) goes down.
>>> d. Eth2 is up, but hostname does not resolve to the IP on eth2 (resolves to
>>> eth1 addr).
>>
>> Inter-node communication is usually specified by IP address, and
>> redundant connections (as in your case) are recommended.
>>
>>> e. Will this (hostname) have any issue ?
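
To make the situation in (b) to (d) concrete, here is a rough Python sketch that checks whether a node's hostname resolves to an address inside the network corosync binds to. It is only an illustration and not part of Pacemaker or corosync: the function name, the SAMPLE_HOSTS table and the /24 prefix length are assumptions (the thread gives only the bind net addr 192.168.10.0, not a netmask).

```python
import ipaddress
import socket

# Hypothetical lookup table mirroring the /etc/hosts entries quoted in the thread.
SAMPLE_HOSTS = {
    "Node-1": "192.168.129.10",
    "Node-2": "192.168.129.11",
}


def resolves_into_network(hostname, network, resolver=socket.gethostbyname):
    """Return (address, inside) where `inside` says whether the address that
    `hostname` resolves to lies within `network` (CIDR notation)."""
    addr = ipaddress.ip_address(resolver(hostname))
    net = ipaddress.ip_network(network)
    return str(addr), addr in net


# Assumption: the corosync bind network is 192.168.10.0/24.
for name in SAMPLE_HOSTS:
    addr, inside = resolves_into_network(
        name, "192.168.10.0/24", resolver=SAMPLE_HOSTS.__getitem__
    )
    print(f"{name} -> {addr}, inside corosync network: {inside}")
```

With the addresses from the thread, both hostnames resolve outside the corosync network, so anything that reaches the peer by hostname (ssh, scp) uses a different path than the cluster messaging does. Leaving out the resolver argument makes the function use the system resolver instead of the sample table.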
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>
>>> Project Home: http://www.clusterlabs.org
>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>> Bugs: http://bugs.clusterlabs.org
>>
>>
>>
>> --
>> Dan Frincu
>> CCNA, RHCE
>>
--
Dan Frincu
CCNA, RHCE