[ClusterLabs] ClusterIP location constraint reappears after reboot

Jeremy Matthews Jeremy.Matthews@genband.com
Thu Feb 18 19:07:19 UTC 2016


Hi,

We're having an issue with our cluster: after a reboot of our system, a location constraint for the ClusterIP resource reappears. This causes a problem because we have a daemon that checks the cluster state and waits until the ClusterIP is started before it kicks off our application. We didn't have this issue with an earlier version of pacemaker. Here is the constraint as shown by pcs:

[root@g5se-f3efce cib]# pcs constraint
Location Constraints:
  Resource: ClusterIP
    Disabled on: g5se-f3efce (role: Started)
Ordering Constraints:
Colocation Constraints:

...and here is our cluster status with the ClusterIP being Stopped:

[root@g5se-f3efce cib]# pcs status
Cluster name: cl-g5se-f3efce
Last updated: Thu Feb 18 11:36:01 2016
Last change: Thu Feb 18 10:48:33 2016 via crm_resource on g5se-f3efce
Stack: cman
Current DC: g5se-f3efce - partition with quorum
Version: 1.1.11-97629de
1 Nodes configured
4 Resources configured


Online: [ g5se-f3efce ]

Full list of resources:

sw-ready-g5se-f3efce   (ocf::pacemaker:GBmon): Started g5se-f3efce
meta-data      (ocf::pacemaker:GBmon): Started g5se-f3efce
netmon (ocf::heartbeat:ethmonitor):    Started g5se-f3efce
ClusterIP      (ocf::heartbeat:IPaddr2):       Stopped


The cluster really just has one node at this time.

I retrieve the constraint ID, remove the constraint, verify that ClusterIP is started, and then reboot:

[root@g5se-f3efce cib]# pcs constraint ref ClusterIP
Resource: ClusterIP
  cli-ban-ClusterIP-on-g5se-f3efce
[root@g5se-f3efce cib]# pcs constraint remove cli-ban-ClusterIP-on-g5se-f3efce

[root@g5se-f3efce cib]# pcs status
Cluster name: cl-g5se-f3efce
Last updated: Thu Feb 18 11:45:09 2016
Last change: Thu Feb 18 11:44:53 2016 via crm_resource on g5se-f3efce
Stack: cman
Current DC: g5se-f3efce - partition with quorum
Version: 1.1.11-97629de
1 Nodes configured
4 Resources configured


Online: [ g5se-f3efce ]

Full list of resources:

sw-ready-g5se-f3efce   (ocf::pacemaker:GBmon): Started g5se-f3efce
meta-data      (ocf::pacemaker:GBmon): Started g5se-f3efce
netmon (ocf::heartbeat:ethmonitor):    Started g5se-f3efce
ClusterIP      (ocf::heartbeat:IPaddr2):       Started g5se-f3efce


[root@g5se-f3efce cib]# reboot

...after the reboot I log back in, and the constraint is back and ClusterIP has not started.


I have noticed in /var/lib/pacemaker/cib that the cib-x.raw files get created when there are changes to the cib (cib.xml). After a reboot, I see the constraint being added in a diff between .raw files:

[root@g5se-f3efce cib]# diff cib-7.raw cib-8.raw
1c1
< <cib epoch="239" num_updates="0" admin_epoch="0" validate-with="pacemaker-1.2" cib-last-written="Thu Feb 18 11:44:53 2016" update-origin="g5se-f3efce" update-client="crm_resource" crm_feature_set="3.0.9" have-quorum="1" dc-uuid="g5se-f3efce">
---
> <cib epoch="240" num_updates="0" admin_epoch="0" validate-with="pacemaker-1.2" cib-last-written="Thu Feb 18 11:46:49 2016" update-origin="g5se-f3efce" update-client="crm_resource" crm_feature_set="3.0.9" have-quorum="1" dc-uuid="g5se-f3efce">
50c50,52
<     <constraints/>
---
>     <constraints>
>       <rsc_location id="cli-ban-ClusterIP-on-g5se-f3efce" rsc="ClusterIP" role="Started" node="g5se-f3efce" score="-INFINITY"/>
>     </constraints>
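One thing I may try, to narrow down whether the constraint comes back during shutdown or after startup: search the numbered .raw files and the cluster log for the constraint ID, and compare the file timestamps against the reboot time (a rough sketch, assuming the default paths shown above; the update-client="crm_resource" attribute in the diff header already suggests crm_resource is what wrote the change):

```shell
# Which saved CIB versions contain the ban constraint?
grep -l 'cli-ban-ClusterIP-on-g5se-f3efce' /var/lib/pacemaker/cib/cib-*.raw 2>/dev/null

# Compare file timestamps against the reboot time to see whether the
# constraint was written back during shutdown or after startup.
ls -l --time-style=full-iso /var/lib/pacemaker/cib/cib-*.raw 2>/dev/null

# Look for log lines around the time the rsc_location entry was injected.
grep -n 'cli-ban\|rsc_location' /var/log/cluster/corosync.log 2>/dev/null
```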


I have also looked in /var/log/cluster/corosync.log and seen logs where it seems the cib is getting updated. I'm not sure whether the constraint is being put back in at shutdown or at startup, and I don't understand why it's being put back in at all. I don't think our daemon code or other scripts are doing this, but that is something I could verify.
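To rule our own scripts in or out, I could grep the boot/shutdown scripts for anything that issues a ban; as far as I understand, cli-ban-* constraint IDs are the ones generated by crm_resource --ban (which pcs resource ban wraps). A rough sketch, with example paths that would need adjusting to wherever our daemon and scripts actually live:

```shell
# Hypothetical audit: search init/rc scripts for commands that could be
# re-adding the ban constraint. The pattern covers crm_resource and pcs
# invocations; /etc/rc.d is just an example location.
grep -rn 'crm_resource\|pcs ' /etc/rc.d 2>/dev/null
```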

********************************

From "yum info pacemaker", my current version is:

Name        : pacemaker
Arch        : x86_64
Version     : 1.1.12
Release     : 8.el6_7.2

My earlier version was:

Name        : pacemaker
Arch        : x86_64
Version     : 1.1.10
Release     : 1.el6_4.4

I'm still using an earlier version of pcs, because the new one seems to have issues with python:

Name        : pcs
Arch        : noarch
Version     : 0.9.90
Release     : 1.0.1.el6.centos

*******************************

If anyone has ideas on the cause or thoughts on this, anything would be greatly appreciated.

Thanks!



Jeremy Matthews

