[ClusterLabs] Node attribute disappears when pacemaker is started

Ken Gaillot kgaillot at redhat.com
Thu Jun 8 10:42:53 EDT 2017


Hi,

Looking at the incident around May 26 16:40:00, here is what happens:

You are setting the attribute for rhel73-2 from rhel73-1, while rhel73-2
is not part of the cluster from rhel73-1's point of view.

The crm shell sets the node attribute for rhel73-2 with a CIB
modification that starts like this:

++ /cib/configuration/nodes:  <node uname="rhel73-2" id="rhel73-2"/>

Note that the node ID is the same as its name. The CIB accepts the
change (because you might be adding the proper node later). The crmd
knows that this is not currently valid:

May 26 16:39:39 rhel73-1 crmd[2908]:   error: Invalid node id: rhel73-2

When rhel73-2 joins the cluster, rhel73-1 learns its node ID, and it
removes the existing (invalid) rhel73-2 entry, including its attributes,
because it assumes that the entry is for an older node that has been
removed.

I believe attributes can be set for a node that's not in the cluster
only if the node IDs are specified explicitly in corosync.conf.
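
As a rough illustration of the sequence above, here is a toy Python
model. It is not Pacemaker's actual code: NodeEntry and on_node_join
are hypothetical names, and the purge rule is only my reading of the
behavior described in this message. Case 2 assumes the entry was
created with the ID the node will really get, which is what explicit
node IDs in corosync.conf would allow.

```python
# Toy model of the behavior described above; NOT Pacemaker's actual code.
# All names here (NodeEntry, on_node_join) are hypothetical.
from dataclasses import dataclass, field


@dataclass
class NodeEntry:
    node_id: str                      # the "id" attribute of <node>
    uname: str                        # the "uname" attribute of <node>
    attributes: dict = field(default_factory=dict)


def on_node_join(nodes, uname, real_id):
    """Reconcile the node list when `uname` joins with ID `real_id`.

    An existing entry with the same name but a different ID is treated
    as a stale node and dropped together with its attributes.
    """
    kept = [n for n in nodes
            if not (n.uname == uname and n.node_id != real_id)]
    if not any(n.uname == uname and n.node_id == real_id for n in kept):
        kept.append(NodeEntry(real_id, uname))
    return kept


# Case 1: entry created with id == uname (what the crm shell wrote).
nodes = [
    NodeEntry("3232261507", "rhel73-1"),
    NodeEntry("rhel73-2", "rhel73-2", {"attrname": "attr2"}),
]
after_join = on_node_join(nodes, "rhel73-2", "3232261508")

# Case 2 (assumption): entry created with the node's real ID.
nodes_with_id = [
    NodeEntry("3232261507", "rhel73-1"),
    NodeEntry("3232261508", "rhel73-2", {"attrname": "attr2"}),
]
after_join_with_id = on_node_join(nodes_with_id, "rhel73-2", "3232261508")

print([(n.node_id, n.attributes) for n in after_join])
print([(n.node_id, n.attributes) for n in after_join_with_id])
```

In case 1 the rhel73-2 entry is replaced by a fresh one without
attributes; in case 2 the IDs match, so nothing is purged.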

You may want to mention the issue to the crm shell developers. It should
probably at least warn if the node isn't known.
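
A minimal sketch of what such a check could look like, in Python. It
is not crm shell code; known_nodes and check_target are hypothetical
helpers. It assumes that a node the cluster already knows shows up in
"crm configure show" output as "node <numeric id>: <name>", as
rhel73-1 does in the output quoted below.

```python
import re

# Matches crm shell lines such as "node 3232261507: rhel73-1".
NODE_WITH_ID = re.compile(r"^node\s+(\d+):\s+(\S+)")


def known_nodes(config_text):
    """Map node name -> numeric ID for nodes listed with an ID."""
    found = {}
    for line in config_text.splitlines():
        match = NODE_WITH_ID.match(line.strip())
        if match:
            found[match.group(2)] = match.group(1)
    return found


def check_target(config_text, target):
    """Return a warning string if `target` is not a known node, else None."""
    if target in known_nodes(config_text):
        return None
    return f"warning: node {target} is not known to the cluster yet"


# Sample text in the shape of the "crm configure show" output in this thread.
sample = """node 3232261507: rhel73-1
property cib-bootstrap-options: \\
  have-watchdog=false \\
  cluster-infrastructure=corosync
"""
print(check_target(sample, "rhel73-2"))
print(check_target(sample, "rhel73-1"))
```

Run against the step 1 output, this would flag rhel73-2 before the
load and stay silent for rhel73-1.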


On 05/31/2017 09:35 PM, 井上 和徳 wrote:
> Hi Ken,
> 
> I'm sorry, the attachment was too large.
> I uploaded it to GitHub instead; please take a look.
> https://github.com/inouekazu/pcmk_report/blob/master/pcmk-Fri-26-May-2017.tar.bz2
> 
>> -----Original Message-----
>> From: Ken Gaillot [mailto:kgaillot at redhat.com]
>> Sent: Thursday, June 01, 2017 8:43 AM
>> To: users at clusterlabs.org
>> Subject: Re: [ClusterLabs] Node attribute disappears when pacemaker is started
>>
>> On 05/26/2017 03:21 AM, 井上 和徳 wrote:
>>> Hi Ken,
>>>
>>> I got crm_report.
>>>
>>> Regards,
>>> Kazunori INOUE
>>
>> I don't think it attached -- my mail client says it's 0 bytes.
>>
>>>> -----Original Message-----
>>>> From: Ken Gaillot [mailto:kgaillot at redhat.com]
>>>> Sent: Friday, May 26, 2017 4:23 AM
>>>> To: users at clusterlabs.org
>>>> Subject: Re: [ClusterLabs] Node attribute disappears when pacemaker is started
>>>>
>>>> On 05/24/2017 05:13 AM, 井上 和徳 wrote:
>>>>> Hi,
>>>>>
>>>>> After I load a node attribute and then start pacemaker on that node, the attribute disappears.
>>>>>
>>>>> 1. Start pacemaker on node1.
>>>>> 2. Load a configuration containing a node attribute for node2.
>>>>>    (I use multicast addresses in corosync, so I did not set "nodelist {nodeid: }" in corosync.conf.)
>>>>> 3. Start pacemaker on node2; the node attribute that should have been loaded disappears.
>>>>>    Is this the expected behavior?
>>>>
>>>> Hi,
>>>>
>>>> No, this should not happen for a permanent node attribute.
>>>>
>>>> Transient node attributes (status-attr in crm shell) are erased when the
>>>> node starts, so it would be expected in that case.
>>>>
>>>> I haven't been able to reproduce this with a permanent node attribute.
>>>> Can you attach logs from both nodes around the time node2 is started?
>>>>
>>>>>
>>>>> 1.
>>>>> [root@rhel73-1 ~]# systemctl start corosync;systemctl start pacemaker
>>>>> [root@rhel73-1 ~]# crm configure show
>>>>> node 3232261507: rhel73-1
>>>>> property cib-bootstrap-options: \
>>>>>   have-watchdog=false \
>>>>>   dc-version=1.1.17-0.1.rc2.el7-524251c \
>>>>>   cluster-infrastructure=corosync
>>>>>
>>>>> 2.
>>>>> [root@rhel73-1 ~]# cat rhel73-2.crm
>>>>> node rhel73-2 \
>>>>>   utilization capacity="2" \
>>>>>   attributes attrname="attr2"
>>>>>
>>>>> [root@rhel73-1 ~]# crm configure load update rhel73-2.crm
>>>>> [root@rhel73-1 ~]# crm configure show
>>>>> node 3232261507: rhel73-1
>>>>> node rhel73-2 \
>>>>>   utilization capacity=2 \
>>>>>   attributes attrname=attr2
>>>>> property cib-bootstrap-options: \
>>>>>   have-watchdog=false \
>>>>>   dc-version=1.1.17-0.1.rc2.el7-524251c \
>>>>>   cluster-infrastructure=corosync
>>>>>
>>>>> 3.
>>>>> [root@rhel73-1 ~]# ssh rhel73-2 'systemctl start corosync;systemctl start pacemaker'
>>>>> [root@rhel73-1 ~]# crm configure show
>>>>> node 3232261507: rhel73-1
>>>>> node 3232261508: rhel73-2
>>>>> property cib-bootstrap-options: \
>>>>>   have-watchdog=false \
>>>>>   dc-version=1.1.17-0.1.rc2.el7-524251c \
>>>>>   cluster-infrastructure=corosync
>>>>>
>>>>> Regards,
>>>>> Kazunori INOUE



