[Pacemaker] Invisible dependency (at least to me)

Robert Dahlem Robert.Dahlem at gmx.net
Wed Jul 16 14:44:57 UTC 2014


Hi,

my first post on this might have been too complicated, so I broke it
down to a test case.

I have four resources: A1, B1, C1 and B2. B1 is a Master/Slave.

The complete group should run on node korfwf01 (preferably) or on node
korfwf02, not on korfwm01, not on korfwm02.

B1:Master depends on A1, C1 depends on B1:Master.
B2 depends on A1 too.

My problem is: if B2 fails, B1 stays Slave/Slave (no instance gets
promoted), although B1 does not depend on B2.

The templates are:

rsc_template template-DUMMY ocf:heartbeat:Dummy \
        op start on-fail="stop" interval="0" \
        op stop on-fail="block" interval="0"
rsc_template template-DUMMY-not-on-korfwf02 \
        ocf:KORDOBA:dummy-not-on-korfwf02 \
        op start on-fail="stop" interval="0" \
        op stop on-fail="block" interval="0"
rsc_template template-MS-Dummy ocf:KORDOBA:CloneDummy \
        op start interval="0" timeout="20" \
        op promote interval="0" timeout="20" \
        op demote interval="0" timeout="20" \
        op notify interval="0" timeout="20" \
        op stop interval="0" timeout="20" \
        op monitor role="Slave" timeout="20" interval="20" \
        op monitor role="Master" timeout="20" interval="10"

ocf:KORDOBA:dummy-not-on-korfwf02 is a copy of ocf:heartbeat:Dummy,
modified so that it always fails on node korfwf02:
	if [ "$(uname -n)" = "korfwf02" ]; then
		return $OCF_ERR_GENERIC
	fi
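
For anyone who wants to reproduce this without my agent, the check can be
sketched as a standalone function. This is only an illustration: node_check
is a made-up helper name, the optional argument exists only so the check can
be tried on any host, and the two return codes are set by hand here, whereas
a real agent gets them from ocf-shellfuncs.

```shell
#!/bin/sh
# Stand-ins for the OCF return codes; a real agent sources ocf-shellfuncs.
: "${OCF_SUCCESS:=0}"
: "${OCF_ERR_GENERIC:=1}"

# node_check is a hypothetical helper, not taken from the real agent.
# It fails with OCF_ERR_GENERIC when the node name is korfwf02.
# The optional first argument overrides the local node name for testing.
node_check() {
    node="${1:-$(uname -n)}"
    if [ "$node" = "korfwf02" ]; then
        return "$OCF_ERR_GENERIC"
    fi
    return "$OCF_SUCCESS"
}
```

Calling node_check korfwf02 returns 1, which matches the 'unknown error' (1)
reported for B2_start_0 further down.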

ocf:KORDOBA:CloneDummy is derived from ocf:heartbeat:Dummy and extended
with promote, demote and notify actions. I get the same results with
real DRBD.

My configuration:

====================================================================
node korfwf01
node korfwf02
node korfwm01
node korfwm02
primitive A1 @template-DUMMY
primitive B1-sub @template-MS-Dummy
ms B1 B1-sub meta master-max=1 master-node-max=1 \
  clone-max=2 clone-node-max=1 notify=true
primitive B2 @template-DUMMY-not-on-korfwf02
primitive C1 @template-DUMMY
location loc-A1-korfwf01 A1 2: korfwf01
location loc-A1-korfwf02 A1 1: korfwf02
location loc-A1-korfwm01 A1 -inf: korfwm01
location loc-A1-korfwm02 A1 -inf: korfwm02
location loc-B1-korfwm01 B1 -inf: korfwm01
location loc-B1-korfwm02 B1 -inf: korfwm02
order ord-A1-before-B1 inf: A1 B1:promote
order ord-A1-before-B2 inf: A1 B2
order ord-B1-before-C1 inf: B1:promote C1:start
colocation coloc-B1-follows-A1 inf: B1:Master A1
colocation coloc-B2-follows-A1 inf: B2 A1
colocation coloc-C1-follows-B1 inf: C1 B1:Master
====================================================================

In the beginning everything is ok:
--------------------------------------------------------------------
# crm status inactive
 A1     (ocf::heartbeat:Dummy): Started korfwf01
 Master/Slave Set: B1 [B1-sub]
     Masters: [ korfwf01 ]
     Slaves: [ korfwf02 ]
 B2     (ocf::KORDOBA:dummy-not-on-korfwf02):   Started korfwf01
 C1     (ocf::heartbeat:Dummy): Started korfwf01
--------------------------------------------------------------------

Now I move the complete group to node korfwf02. B2 will fail there, but
nothing depends on it, so it should be the only resource that is not
started. Instead:
--------------------------------------------------------------------
# crm resource move A1 korfwf02
# crm status inactive
 A1     (ocf::heartbeat:Dummy): Started korfwf02
 Master/Slave Set: B1 [B1-sub]
     Slaves: [ korfwf01 korfwf02 ]
 B2     (ocf::KORDOBA:dummy-not-on-korfwf02):   Stopped
 C1     (ocf::heartbeat:Dummy): Stopped

Failed actions:
    B2_start_0 on korfwf02 'unknown error' (1): call=1602, status=complete, last-rc-change='Wed Jul 16 15:50:49 2014', queued=0ms, exec=8ms
--------------------------------------------------------------------

B1 stays Slave/Slave and I do not understand why.

--------------------------------------------------------------------
# ptest -s -L | grep B1
clone_color: B1 allocation score on korfwf01: 0
clone_color: B1 allocation score on korfwf02: 0
clone_color: B1 allocation score on korfwm01: -INFINITY
clone_color: B1 allocation score on korfwm02: -INFINITY
clone_color: B1-sub:0 allocation score on korfwf01: 0
clone_color: B1-sub:0 allocation score on korfwf02: 100
clone_color: B1-sub:0 allocation score on korfwm01: -INFINITY
clone_color: B1-sub:0 allocation score on korfwm02: -INFINITY
clone_color: B1-sub:1 allocation score on korfwf01: 100
clone_color: B1-sub:1 allocation score on korfwf02: 0
clone_color: B1-sub:1 allocation score on korfwm01: -INFINITY
clone_color: B1-sub:1 allocation score on korfwm02: -INFINITY
native_color: B1-sub:0 allocation score on korfwf01: 0
native_color: B1-sub:0 allocation score on korfwf02: 100
native_color: B1-sub:0 allocation score on korfwm01: -INFINITY
native_color: B1-sub:0 allocation score on korfwm02: -INFINITY
native_color: B1-sub:1 allocation score on korfwf01: 100
native_color: B1-sub:1 allocation score on korfwf02: -INFINITY
native_color: B1-sub:1 allocation score on korfwm01: -INFINITY
native_color: B1-sub:1 allocation score on korfwm02: -INFINITY
B1-sub:1 promotion score on korfwf01: -1
B1-sub:0 promotion score on korfwf02: -INFINITY
--------------------------------------------------------------------

Where does this score come from?
	native_color: B1-sub:1 allocation score on korfwf02: -INFINITY
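
To pick the -INFINITY entries for B1 on the two korfwf nodes out of the
ptest output, a small filter along these lines could be used. It is only a
sketch: neg_inf_scores is a made-up helper name, and it assumes the score
lines keep the "... on <node>: <score>" layout shown above.

```shell
#!/bin/sh
# neg_inf_scores is a hypothetical helper. It reads score lines on stdin
# and prints those that mention the resource name given as $1 and carry a
# -INFINITY score on a node whose name matches the regex given as $2.
neg_inf_scores() {
    awk -v rsc="$1" -v nodes="$2" '
        $NF == "-INFINITY" && index($0, rsc) {
            # The node name is the second-to-last field, minus the colon.
            node = $(NF - 1)
            sub(/:$/, "", node)
            if (node ~ nodes) print
        }'
}
```

Usage would be something like:
	ptest -s -L | neg_inf_scores B1 '^korfwf'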

Could someone please look into this?

Kind regards,
Robert



