[Pacemaker] different behavior cibadmin -Ql with cman and corosync2
Andrew Beekhof
andrew at beekhof.net
Thu Aug 29 23:12:47 EDT 2013
On 29/08/2013, at 7:31 PM, Andrey Groshev <greenx at yandex.ru> wrote:
>
>
> 29.08.2013, 12:25, "Andrey Groshev" <greenx at yandex.ru>:
>> 29.08.2013, 02:55, "Andrew Beekhof" <andrew at beekhof.net>:
>>
>>> On 28/08/2013, at 5:38 PM, Andrey Groshev <greenx at yandex.ru> wrote:
>>>> 28.08.2013, 04:06, "Andrew Beekhof" <andrew at beekhof.net>:
>>>>> On 27/08/2013, at 1:13 PM, Andrey Groshev <greenx at yandex.ru> wrote:
>>>>>> 27.08.2013, 05:39, "Andrew Beekhof" <andrew at beekhof.net>:
>>>>>>> On 26/08/2013, at 3:09 PM, Andrey Groshev <greenx at yandex.ru> wrote:
>>>>>>>> 26.08.2013, 03:34, "Andrew Beekhof" <andrew at beekhof.net>:
>>>>>>>>> On 23/08/2013, at 9:39 PM, Andrey Groshev <greenx at yandex.ru> wrote:
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>> Today I tried to convert my test cluster from cman to corosync2.
>>>>>>>>>> I noticed the following:
>>>>>>>>>> If I reset a cman cluster with cibadmin --erase --force,
>>>>>>>>>> the node names still exist in the CIB.
>>>>>>>>> Yes, the cluster puts back entries for all the nodes it knows about automagically.
>>>>>>>>>> cibadmin -Ql
>>>>>>>>>> .....
>>>>>>>>>> <nodes>
>>>>>>>>>> <node id="dev-cluster2-node2.unix.tensor.ru" uname="dev-cluster2-node2"/>
>>>>>>>>>> <node id="dev-cluster2-node4.unix.tensor.ru" uname="dev-cluster2-node4"/>
>>>>>>>>>> <node id="dev-cluster2-node3.unix.tensor.ru" uname="dev-cluster2-node3"/>
>>>>>>>>>> </nodes>
>>>>>>>>>> ....
>>>>>>>>>>
>>>>>>>>>> Even if cman and pacemaker are running on only one node.
>>>>>>>>> I'm assuming all three are configured in cluster.conf?
>>>>>>>> Yes, the list of nodes is there.
>>>>>>>>>> And if I do the same on a cluster with corosync2,
>>>>>>>>>> I see only the names of the nodes that are running corosync and pacemaker.
>>>>>>>>> Since you haven't included your config, I can only guess that your corosync.conf does not have a nodelist.
>>>>>>>>> If it did, you should get the same behaviour.
>>>>>>>> I tried both expected_node and nodelist.
>>>>>>> And it didn't work? What version of pacemaker?
>>>>>> It does not work as I expected.
>>>>> That's because you've used IP addresses in the node list.
>>>>> i.e.
>>>>>
>>>>> node {
>>>>>     ring0_addr: 10.76.157.17
>>>>> }
>>>>>
>>>>> Try including the node name as well, e.g.
>>>>>
>>>>> node {
>>>>>     name: dev-cluster2-node2
>>>>>     ring0_addr: 10.76.157.17
>>>>> }
>>>> The same thing.
>>> I don't know what to say. I tested it here yesterday and it worked as expected.
>>
>> I found the reason that you and I get different results: I did not have a reverse DNS zone for these nodes.
>> I know it should be there, but (PACEMAKER + CMAN) worked without a reverse zone!
>>
>
> I was too hasty. I deleted everything, reinstalled, and reconfigured. It is not working again. Damn!
It would have surprised me... pacemaker 1.1.11 doesn't do any DNS lookups - reverse or otherwise.
Can you set
PCMK_trace_files=corosync.c
in your environment and retest?
On RHEL6 that means putting the following in /etc/sysconfig/pacemaker
export PCMK_trace_files=corosync.c
It should produce additional logging[1] that will help diagnose the issue.
[1] http://blog.clusterlabs.org/blog/2013/pacemaker-logging/
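
As a minimal sketch of the step above (not a tested procedure), the setting could be added with a small shell helper. The function name enable_pcmk_trace is made up for this example, and the default path is simply the RHEL6 location mentioned above; it only appends the line if it is not already there, so running it twice is harmless.

```shell
#!/bin/sh
# Sketch: idempotently enable Pacemaker trace logging for corosync.c.
# enable_pcmk_trace is a hypothetical helper name, not part of Pacemaker.
enable_pcmk_trace() {
    # Default to the RHEL6 location; pass another path as $1 to override.
    conf="${1:-/etc/sysconfig/pacemaker}"
    line='export PCMK_trace_files=corosync.c'
    # Append the setting only if an identical line is not already present.
    if ! grep -qxF "$line" "$conf" 2>/dev/null; then
        printf '%s\n' "$line" >> "$conf"
    fi
}
```

After calling enable_pcmk_trace as root, pacemaker presumably needs to be restarted (e.g. service pacemaker restart on RHEL6) before the extra logging shows up, since the file is only read when the daemons start.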
>
>>>> # corosync-cmapctl |grep nodelist
>>>> nodelist.local_node_pos (u32) = 2
>>>> nodelist.node.0.name (str) = dev-cluster2-node2
>>>> nodelist.node.0.ring0_addr (str) = 10.76.157.17
>>>> nodelist.node.1.name (str) = dev-cluster2-node3
>>>> nodelist.node.1.ring0_addr (str) = 10.76.157.18
>>>> nodelist.node.2.name (str) = dev-cluster2-node4
>>>> nodelist.node.2.ring0_addr (str) = 10.76.157.19
>>>>
>>>> # corosync-quorumtool -s
>>>> Quorum information
>>>> ------------------
>>>> Date: Wed Aug 28 11:29:49 2013
>>>> Quorum provider: corosync_votequorum
>>>> Nodes: 1
>>>> Node ID: 172793107
>>>> Ring ID: 52
>>>> Quorate: No
>>>>
>>>> Votequorum information
>>>> ----------------------
>>>> Expected votes: 3
>>>> Highest expected: 3
>>>> Total votes: 1
>>>> Quorum: 2 Activity blocked
>>>> Flags:
>>>>
>>>> Membership information
>>>> ----------------------
>>>> Nodeid Votes Name
>>>> 172793107 1 dev-cluster2-node4 (local)
>>>>
>>>> # cibadmin -Q
>>>> <cib epoch="25" num_updates="3" admin_epoch="0" validate-with="pacemaker-1.2" crm_feature_set="3.0.7" cib-last-written="Wed Aug 28 11:24:06 2013" update-origin="dev-cluster2-node4" update-client="crmd" have-quorum="0" dc-uuid="172793107">
>>>>   <configuration>
>>>>     <crm_config>
>>>>       <cluster_property_set id="cib-bootstrap-options">
>>>>         <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="1.1.11-1.el6-4f672bc"/>
>>>>         <nvpair id="cib-bootstrap-options-cluster-infrastructure" name="cluster-infrastructure" value="corosync"/>
>>>>       </cluster_property_set>
>>>>     </crm_config>
>>>>     <nodes>
>>>>       <node id="172793107" uname="dev-cluster2-node4"/>
>>>>     </nodes>
>>>>     <resources/>
>>>>     <constraints/>
>>>>   </configuration>
>>>>   <status>
>>>>     <node_state id="172793107" uname="dev-cluster2-node4" in_ccm="true" crmd="online" crm-debug-origin="do_state_transition" join="member" expected="member">
>>>>       <lrm id="172793107">
>>>>         <lrm_resources/>
>>>>       </lrm>
>>>>       <transient_attributes id="172793107">
>>>>         <instance_attributes id="status-172793107">
>>>>           <nvpair id="status-172793107-probe_complete" name="probe_complete" value="true"/>
>>>>         </instance_attributes>
>>>>       </transient_attributes>
>>>>     </node_state>
>>>>   </status>
>>>> </cib>
>>>>>> I figured out a way to get around this, but it would be easier if the CIB worked as it does with CMAN.
>>>>>> I just do not start the main resource if the attribute is not defined or is not true.
>>>>>> This slightly changes the logic of the cluster.
>>>>>> But I'm not sure what the correct behavior is.
>>>>>>
>>>>>> libqb 0.14.4
>>>>>> corosync 2.3.1
>>>>>> pacemaker 1.1.11
>>>>>>
>>>>>> All built from source in the previous week.
>>>>>>>> Now in corosync.conf:
>>>>>>>>
>>>>>>>> totem {
>>>>>>>>     version: 2
>>>>>>>>     crypto_cipher: none
>>>>>>>>     crypto_hash: none
>>>>>>>>     interface {
>>>>>>>>         ringnumber: 0
>>>>>>>>         bindnetaddr: 10.76.157.18
>>>>>>>>         mcastaddr: 239.94.1.56
>>>>>>>>         mcastport: 5405
>>>>>>>>         ttl: 1
>>>>>>>>     }
>>>>>>>> }
>>>>>>>> logging {
>>>>>>>>     fileline: off
>>>>>>>>     to_stderr: no
>>>>>>>>     to_logfile: yes
>>>>>>>>     logfile: /var/log/cluster/corosync.log
>>>>>>>>     to_syslog: yes
>>>>>>>>     debug: on
>>>>>>>>     timestamp: on
>>>>>>>>     logger_subsys {
>>>>>>>>         subsys: QUORUM
>>>>>>>>         debug: on
>>>>>>>>     }
>>>>>>>> }
>>>>>>>> quorum {
>>>>>>>>     provider: corosync_votequorum
>>>>>>>> }
>>>>>>>> nodelist {
>>>>>>>>     node {
>>>>>>>>         ring0_addr: 10.76.157.17
>>>>>>>>     }
>>>>>>>>     node {
>>>>>>>>         ring0_addr: 10.76.157.18
>>>>>>>>     }
>>>>>>>>     node {
>>>>>>>>         ring0_addr: 10.76.157.19
>>>>>>>>     }
>>>>>>>> }
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>>>>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>>>>>>
>>>>>>>> Project Home: http://www.clusterlabs.org
>>>>>>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>>>>>> Bugs: http://bugs.clusterlabs.org