[ClusterLabs] PCMK_node_start_state=standby sometimes does not work
井上 和徳
inouekazu at intellilink.co.jp
Tue Nov 28 04:36:52 EST 2017
Hi,
Sometimes a node with 'PCMK_node_start_state=standby' will start up Online.
[ reproduction scenario ]
* Set 'PCMK_node_start_state=standby' in /etc/sysconfig/pacemaker.
* Delete the CIB (/var/lib/pacemaker/cib/*).
* Start pacemaker at the same time on 2 nodes.
# for i in rhel74-1 rhel74-3 ; do ssh -f $i systemctl start pacemaker ; done
[ actual result ]
* crm_mon
Stack: corosync
Current DC: rhel74-3 (version 1.1.18-2b07d5c) - partition with quorum
Last change: Wed Nov 22 06:22:50 2017 by hacluster via crmd on rhel74-3
2 nodes configured
0 resources configured
Node rhel74-3: standby
Online: [ rhel74-1 ]
* cib.xml
<nodes>
<node id="3232261507" uname="rhel74-1"/>
<node id="3232261509" uname="rhel74-3">
<instance_attributes id="nodes-3232261509">
<nvpair id="nodes-3232261509-standby" name="standby" value="on"/>
</instance_attributes>
</node>
</nodes>
* pacemaker.log
Nov 22 06:22:50 [20755] rhel74-1 crmd: (cib_native.c:462 ) warning: cib_native_perform_op_delegate: Call failed: No such device or address
Nov 22 06:22:50 [20755] rhel74-1 crmd: ( cib_attrs.c:320 ) info: update_attr_delegate: Update <node id="3232261507">
Nov 22 06:22:50 [20755] rhel74-1 crmd: ( cib_attrs.c:320 ) info: update_attr_delegate: Update <instance_attributes id="nodes-3232261507">
Nov 22 06:22:50 [20755] rhel74-1 crmd: ( cib_attrs.c:320 ) info: update_attr_delegate: Update <nvpair id="nodes-3232261507-standby" name="standby" value="on"/>
Nov 22 06:22:50 [20755] rhel74-1 crmd: ( cib_attrs.c:320 ) info: update_attr_delegate: Update </instance_attributes>
Nov 22 06:22:50 [20755] rhel74-1 crmd: ( cib_attrs.c:320 ) info: update_attr_delegate: Update </node>
* I uploaded the crm_report to GitHub (it is too big to attach to this email), so please take a look at it.
https://github.com/inouekazu/pcmk_report/blob/master/pcmk-Wed-22-Nov-2017.tar.bz2
I think the cause is the timing at which <node id="3232261507"> (*1) and <instance_attributes id="nodes-3232261507"> (*2) are added to the CIB.
*1 <node id="3232261507" uname="rhel74-1"/>
*2 <instance_attributes id="nodes-3232261507">
<nvpair id="nodes-3232261507-standby" name="standby" value="on"/>
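To check which nodes actually ended up with the standby attribute, something like the following could be used. This is only a minimal sketch, not part of pacemaker: the function name standby_by_node is hypothetical, the XML is the <nodes> section copied from the cib.xml above, and treating on/true/yes/1 as "set" is my assumption about the accepted boolean spellings.

```python
import xml.etree.ElementTree as ET

# <nodes> section copied from the cib.xml in the actual result.
NODES_XML = """
<nodes>
  <node id="3232261507" uname="rhel74-1"/>
  <node id="3232261509" uname="rhel74-3">
    <instance_attributes id="nodes-3232261509">
      <nvpair id="nodes-3232261509-standby" name="standby" value="on"/>
    </instance_attributes>
  </node>
</nodes>
"""

# Assumption: these spellings count as "standby is set".
TRUE_VALUES = ("on", "true", "yes", "1")


def standby_by_node(nodes_xml):
    """Map each node's uname to True if it carries a standby nvpair that is set."""
    root = ET.fromstring(nodes_xml)
    result = {}
    for node in root.iter("node"):
        values = [
            nv.get("value", "")
            for nv in node.iter("nvpair")
            if nv.get("name") == "standby"
        ]
        result[node.get("uname")] = any(v.lower() in TRUE_VALUES for v in values)
    return result


if __name__ == "__main__":
    for uname, standby in standby_by_node(NODES_XML).items():
        print(uname, "standby" if standby else "NOT standby")
```

With the cib.xml above, rhel74-1 is reported as not standby, which matches the crm_mon output.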
I hope this can be fixed, but if that is difficult, I have two questions.
1) Does this only occur if there is no cib.xml (in other words, there is no <node> element)?
2) Is there any workaround other than "Do not start at the same time"?
Best Regards