[Pacemaker] One more globally-unique clone question

Mon Feb 23 17:58:38 EST 2015

> On 21 Jan 2015, at 5:08 pm, Vladislav Bogdanov <bubble at hoster-ok.com> wrote:
> 
> 21.01.2015 03:51, Andrew Beekhof wrote:
>> 
>>> On 20 Jan 2015, at 4:13 pm, Vladislav Bogdanov <bubble at hoster-ok.com> wrote:
>>> 
>>> 20.01.2015 02:47, Andrew Beekhof wrote:
>>>> 
>>>>> On 17 Jan 2015, at 1:25 am, Vladislav Bogdanov
>>>>> <bubble at hoster-ok.com> wrote:
>>>>> 
>>>>> Hi all,
>>>>> 
>>>>> While trying to reproduce the problem with the early stop of
>>>>> globally-unique clone instances during a move to another node,
>>>>> I found one more "interesting" problem.
>>>>> 
>>>>> Due to the different order of resources in the CIB and the extensive
>>>>> use of constraints between other resources (an odd number of resources
>>>>> cluster-wide), two CLUSTERIP instances are always allocated to the
>>>>> same node in the new testing cluster.
>>>> 
>>>> Ah, so this is why broker-vips:1 was moving.
>>> 
>>> Those are two different 2-node clusters with a different order of resources.
>>> In the first one, broker-vips comes after an even number of resources, and one instance wants to return to its "mother-node" after that node is brought back online, thus broker-vips:1 is moving.
>>> 
>>> In the second one, broker-vips comes after an odd number of resources (actually three more resources are allocated to one node due to constraints) and both broker-vips instances go to the other node.
>>> 
>>>> 
>>>>> 
>>>>> What would be the best/preferred way to make them run on different
>>>>> nodes by default?
>>>> 
>>>> By default they will. I'm assuming it's the constraints that are
>>>> preventing this.
>>> 
>>> I only see that they are allocated the same way as any other resource.
>> 
>> Are they allocated in stages though?
>> I.e., was there a point at which the "mother-node" was available but constraints prevented broker-vips:1 from running there?
> 
> There are three pe-inputs for the node start.
> The first one starts the fence device for the other node, and dlm+clvm+gfs and drbd on the node that came back online.
> The second one tries to start/promote/move everything else until it is interrupted (by the drbd RA?).
> The third one finishes that attempt.

I've lost all context on this and I don't seem to be able to reconstruct it :)
Which part of the above is the problem?

> 
> And yes, CTDB depends on the GFS2 filesystem, so broker-vips:1 can't be allocated immediately due to constraints. It is allocated in the second pe-input.
> 
> Maybe it is worth sending a crm-report to you directly, so that the list is not overloaded with long listings and you have complete information?
> 
>> 
>>> 
>>>> 
>>>> Getting them to auto-rebalance is the harder problem
>>> 
>>> I see. Should it be possible to solve it without using priority or utilization?
>> 
>> "it" meaning auto-rebalancing or your original issue?
> 
> I meant auto-rebalancing.

It should be something we handle internally.
I've made a note of it.

> 
> 
>> 
>>> 
>>>> 
>>>>> 
>>>>> I see the following options:
>>>>> * Raise the priority of the globally-unique clone so its instances
>>>>> are always allocated first.
>>>>> * Use utilization attributes (with high values for nodes and low values
>>>>> for cluster resources).
>>>>> * Anything else?
>>>>> 
>>>>> If I configure virtual IPs one-by-one (without a clone), I can add a
>>>>> colocation constraint with a negative score between them. I do not
>>>>> see a way to scale that setup well though (5-10 IPs). So, what
>>>>> would be the best option to achieve the same with a globally-unique
>>>>> cloned resource? Maybe there should be some internal
>>>>> preference/colocation not to place them together (like the default
>>>>> stickiness=1 for clones)? Or even allow a special negative colocation
>>>>> constraint with the same resource in both 'what' and 'with'
>>>>> (colocation col1 -1: clone clone)?
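
To make the scaling concern above concrete, here is a minimal sketch of what the non-cloned approach would require: one negative colocation constraint per pair of virtual IPs. The resource names (vip1 ... vip5), the constraint ids and the score of -100 are assumptions chosen for illustration and do not come from either cluster discussed here. The output only mimics the crm shell form "colocation <id> <score>: <rsc> <with-rsc>" used in the question.

```python
from itertools import combinations


def anti_colocation_constraints(resources, score=-100):
    """Return one crm shell 'colocation' line per unordered pair of resources.

    The score default (-100) is an illustrative assumption; the thread only
    says "negative score".
    """
    lines = []
    for first, second in combinations(resources, 2):
        # Hypothetical constraint id; any unique id would do.
        constraint_id = f"col-{first}-{second}"
        lines.append(f"colocation {constraint_id} {score}: {first} {second}")
    return lines


if __name__ == "__main__":
    # Hypothetical names for five separately configured virtual IPs.
    vips = [f"vip{i}" for i in range(1, 6)]
    for line in anti_colocation_constraints(vips):
        print(line)
```

The number of constraints grows as n*(n-1)/2: five IPs need 10 constraints and ten IPs need 45, which is presumably why this does not scale well compared with a single globally-unique clone.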
>>>>> 
>>>>> Best, Vladislav
>>>>> 
>>>>> 
>>>>> _______________________________________________
>>>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>>> 
>>>>> Project Home: http://www.clusterlabs.org
>>>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>>> Bugs: http://bugs.clusterlabs.org