[Pacemaker] Promote of one resource leads to start of another resource in heartbeat cluster

Mon Mar 19 15:23:55 UTC 2012

----- Original Message ----- 

> From: "neha chatrath" <nehachatrath at gmail.com>
> To: pacemaker at oss.clusterlabs.org
> Sent: Monday, March 19, 2012 9:10:21 AM
> Subject: [Pacemaker] Promote of one resource leads to start of
> another resource in heartbeat cluster

> Hello,
> I have the following 2 node cluster configuration:

> "node $id="15f8a22d-9b1a-4ce3-bca2-05f654a9ed6a" cps2 \
> attributes standby="off"
> node $id="d3088454-5ff3-4bcd-b94c-5a2567e2759b" cps1 \
> attributes standby="off"
> primitive CPS ocf:heartbeat:jboss_cps \
> params jboss_home="/home/cluster/cps/ jboss-5.1.0.GA/ "
> java_home="/usr/" run_opts="-c all -b 0.0.0.0 -g clusterCPS
> -Djboss.service.binding.set=ports-01
> -Djboss.messaging.ServerPeerID=01" statusurl=" http://127.0.0.1:8180
> " shutdown_opts="-s 127.0.0.1:1199 " pstring="clusterCPS" \
> op start interval="0" timeout="150" \
> op stop interval="0" timeout="240" \
> op monitor interval="30s" timeout="40s"
> primitive ClusterIP ocf:heartbeat:IPaddr2 \
> params ip="192.168.114.150" cidr_netmask="32" nic="bond0:114:1" \
> op monitor interval="40" timeout="20" \
> meta target-role="Started"
> primitive EMS ocf:heartbeat:jboss \
> params jboss_home="/home/cluster/cps/Jboss_EMS/ jboss-5.1.0.GA "
> java_home="/usr/" run_opts="-c all -b 0.0.0.0 -g clusterEMS"
> pstring="clusterEMS" \
> op start interval="0" timeout="60" \
> op stop interval="0" timeout="240" \
> op monitor interval="30s" timeout="40s"
> primitive LB ocf:ptt:lb_ptt \
> op monitor interval="40"
> primitive NDB_MGMT ocf:ptt:NDB_MGM_RA \
> op monitor interval="120" timeout="120"
> primitive NDB_VIP ocf:heartbeat:IPaddr2 \
> params ip="192.168.117.150" cidr_netmask="255.255.255.255"
> nic="bond0.117:4" \
> op monitor interval="30" timeout="25"
> primitive Rmgr ocf:ptt:RM_RA \
> op monitor interval="60" role="Master" timeout="30" on-fail="restart"
> \
> op monitor interval="40" role="Slave" timeout="40" on-fail="restart"
> \
> op start interval="0" role="Master" timeout="30" \
> op start interval="0" role="Slave" timeout="35"
> primitive mysql ocf:ptt:MYSQLD_RA \
> op monitor interval="180" timeout="200" \
> op start interval="0" timeout="40"
> primitive ndbd ocf:ptt:NDBD_RA \
> op monitor interval="120" timeout="120"
> ms CPS_CLONE CPS \

Is this a typo? Shouldn't CPS_CLONE be defined as a clone rather than an ms resource?
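
If it is meant to be a plain clone, a minimal sketch of the definition might look like the lines below. It assumes the jboss_cps agent has no promote/demote actions, and it keeps only the clone-related meta attributes you already quoted:

clone CPS_CLONE CPS \
        meta clone-max="2" clone-node-max="1" interleave="true" notify="true"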

> meta master-max="1" master-max-node="1" clone-max="2"
> clone-node-max="1" interleave="true" notify="true"
> ms ms_Rmgr Rmgr \
> meta master-max="1" master-max-node="1" clone-max="2"
> clone-node-max="1" interleave="true" notify="true"
> target-role="Started"
> ms ms_mysqld mysql \
> meta master-max="1" master-max-node="1" clone-max="2"
> clone-node-max="1" interleave="true" notify="true"
> clone EMS_CLONE EMS \
> meta globally-unique="false" clone-max="2" clone-node-max="1"
> clone LB_CLONE LB \
> meta globally-unique="false" clone-max="2" clone-node-max="1"
> target-role="Started"
> clone ndbdclone ndbd \
> meta globally-unique="false" clone-max="2" clone-node-max="1"
> colocation RM_with_ip inf: ms_Rmgr:Master ClusterIP
> colocation ndb_vip-with-ndb_mgm inf: NDB_MGMT NDB_VIP
> order RM-after-ip inf: ClusterIP ms_Rmgr

Order statements apply the first resource's action to all the others (start, by default), so this is equivalent to:
order RM-after-ip inf: ClusterIP:start ms_Rmgr:start

However, you don't want the ms resource to start; you want it to promote. When the actions differ within an order statement, you must specify the action explicitly for every resource, like this:
order RM-after-ip inf: ClusterIP:start ms_Rmgr:promote

Fix all the order statements that involve an ms resource (or mixed action types) and see if that clears it up.

> order cps-after-mysqld inf: ms_mysqld CPS_CLONE

Same here:
order cps-after-mysqld inf: ms_mysqld:promote CPS_CLONE:start
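
The remaining order statements that involve ms_mysqld (ip-after-mysqld and mysqld-after-ndbd, further down in the quoted configuration) could be spelled out the same way. This is only a sketch: it assumes ClusterIP should wait for mysqld to be promoted, and that ms_mysqld only needs ndbd started before it starts. You would need to confirm both for your setup:

order ip-after-mysqld inf: ms_mysqld:promote ClusterIP:start
order mysqld-after-ndbd inf: ndbdclone:start ms_mysqld:start

Running crm_verify -L after the change should flag any problems in the live configuration.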

HTH
Jake

> order ip-after-mysqld inf: ms_mysqld ClusterIP
> order lb-after-cps inf: CPS_CLONE LB_CLONE
> order mysqld-after-ndbd inf: ndbdclone ms_mysqld
> order ndb_mgm-after-ndb_vip inf: NDB_VIP NDB_MGMT
> order ndbd-after-ndb_mgm inf: NDB_MGMT ndbdclone
> property $id="cib-bootstrap-options" \
> dc-version="1.0.11-9af47ddebcad19e35a61b2a20301dc038018e8e8" \
> cluster-infrastructure="Heartbeat" \
> no-quorum-policy="ignore" \
> stonith-enabled="false"
> rsc_defaults $id="rsc-options" \
> resource-stickiness="100" \
> migration_threshold="3"
> "
> When I bring down the active node in the cluster, ms_mysqld resource
> on the standby node is promoted but another resource (ms_Rmgr) gets
> re-started.

<snip>