[Pacemaker] BUG in master scores: negative master score leads to _stop_ of instance - why?

Lars Ellenberg lars.ellenberg at linbit.com
Mon Oct 12 09:04:32 UTC 2009


On Mon, Oct 12, 2009 at 09:39:58AM +0200, Andrew Beekhof wrote:
> On Thu, Oct 1, 2009 at 5:46 PM, Lars Ellenberg
> <lars.ellenberg at linbit.com> wrote:
> > On Thu, Oct 01, 2009 at 05:45:30PM +0200, Lars Ellenberg wrote:
> >>
> >> attached is a full cibadmin -Q.
> >
> > I hate it. grmblf.
> >
> >> there is exactly one master slave configured,
> >> primitive is the dummy Stateful agent.
> >>
> >> current cluster state is "stable",
> >> one Master, one Slave, all well.
> >>
> >> if I now change the master score of the slave to -2 (or -5, or
> >> -INFINITY), ptest complains about "cannot run anywhere",
> >> and stops it!
> >> (which is also what we see when actually using it,
> >>  whether with that dummy Stateful RA, or with DRBD).
> >>
> >> I was told to use negative master scores to prevent promotion,
> >> and I'd like to do so.  But apparently that is a no-go.
> >>
> >> Why would a negative master score cause a slave to be stopped?
> 
> Looks like something that's been fixed since 1.0.5
> I get:
> 
> [09:33 AM] beekhof at mobile ~/Development/pacemaker/stable-1.0 #
> pengine/ptest -VVVV --xml-file ~/Downloads/Stateful-bug.xml -s -D
> foo.dot
> Allocation scores:
> ptest[12205]: 2009/10/12_09:33:48 notice: unpack_config: On loss of
> CCM Quorum: Ignore
> ptest[12205]: 2009/10/12_09:33:48 info: unpack_config: Node scores:
> 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
> ptest[12205]: 2009/10/12_09:33:48 info: determine_online_status: Node
> sles11-b is online
> ptest[12205]: 2009/10/12_09:33:48 info: determine_online_status: Node
> sles11-a is online
> ptest[12205]: 2009/10/12_09:33:48 notice: clone_print: Master/Slave
> Set: ms_res_Stateful_1
> ptest[12205]: 2009/10/12_09:33:48 notice: print_list: 	Masters: [ sles11-a ]
> ptest[12205]: 2009/10/12_09:33:48 notice: print_list: 	Slaves: [ sles11-b ]
> clone_color: ms_res_Stateful_1 allocation score on sles11-a: 0
> clone_color: ms_res_Stateful_1 allocation score on sles11-b: 0
> clone_color: res_Stateful_1:0 allocation score on sles11-a: 11
> clone_color: res_Stateful_1:0 allocation score on sles11-b: 0
> clone_color: res_Stateful_1:1 allocation score on sles11-a: 0
> clone_color: res_Stateful_1:1 allocation score on sles11-b: 6
> native_color: res_Stateful_1:0 allocation score on sles11-a: 11
> native_color: res_Stateful_1:0 allocation score on sles11-b: 0
> native_color: res_Stateful_1:1 allocation score on sles11-a: -1000000
> native_color: res_Stateful_1:1 allocation score on sles11-b: 6
> res_Stateful_1:0 promotion score on sles11-a: 10
> ptest[12205]: 2009/10/12_09:33:48 info: master_color: Promoting
> res_Stateful_1:0 (Master sles11-a)
> res_Stateful_1:1 promotion score on sles11-b: 5


That is perfectly fine.
The provided XML describes exactly this stable state.

Now go and change the ="5" to ="-5" there,
and see which actions ptest says it would trigger.
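
For anyone who wants to script that edit rather than do it by hand, here is
a minimal sketch in Python. It rests on assumptions: the XML below is a
hypothetical, heavily trimmed stand-in for the status section of a CIB dump,
not the attached file, and the "master-<resource>" nvpair naming, the ids,
and the set_master_score helper are illustrative only.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed stand-in for a CIB status section.
# The ids and the "master-<resource>" attribute names are assumptions.
SAMPLE_CIB = """
<cib>
  <status>
    <node_state id="sles11-a" uname="sles11-a">
      <transient_attributes id="sles11-a">
        <instance_attributes id="status-sles11-a">
          <nvpair id="status-sles11-a-master"
                  name="master-res_Stateful_1:0" value="10"/>
        </instance_attributes>
      </transient_attributes>
    </node_state>
    <node_state id="sles11-b" uname="sles11-b">
      <transient_attributes id="sles11-b">
        <instance_attributes id="status-sles11-b">
          <nvpair id="status-sles11-b-master"
                  name="master-res_Stateful_1:1" value="5"/>
        </instance_attributes>
      </transient_attributes>
    </node_state>
  </status>
</cib>
"""


def set_master_score(cib_xml, node, new_value):
    """Return (xml_text, count) with the master score on `node` replaced.

    Only nvpairs whose name starts with "master-" under the node_state
    element for `node` are touched; everything else is left alone.
    """
    root = ET.fromstring(cib_xml)
    changed = 0
    for state in root.iter("node_state"):
        if state.get("uname") != node:
            continue
        for nvpair in state.iter("nvpair"):
            if nvpair.get("name", "").startswith("master-"):
                nvpair.set("value", str(new_value))
                changed += 1
    return ET.tostring(root, encoding="unicode"), changed


if __name__ == "__main__":
    new_xml, count = set_master_score(SAMPLE_CIB, "sles11-b", -5)
    print("changed", count, "nvpair(s)")
    print(new_xml)
```

The result can be written to a file and passed to ptest via --xml-file with
-s, as in the command quoted above, to compare the scores and the logged
actions before and after the change.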

> ptest[12205]: 2009/10/12_09:33:48 info: master_color:
> ms_res_Stateful_1: Promoted 1 instances of a possible 1 to master
> ptest[12205]: 2009/10/12_09:33:48 notice: LogActions: Leave resource
> res_Stateful_1:0	(Master sles11-a)
> ptest[12205]: 2009/10/12_09:33:48 notice: LogActions: Leave resource
> res_Stateful_1:1	(Slave sles11-b)
> 
> How does that compare to your output?

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.



