[Pacemaker] Unique clone instance is stopped too early on move

Sun Apr 26 20:19:05 UTC 2015

> On 17 Apr 2015, at 4:19 pm, Vladislav Bogdanov <bubble at hoster-ok.com> wrote:
> 
> 17.04.2015 00:48, Andrew Beekhof wrote:
>> 
>>> On 22 Jan 2015, at 12:04 am, Vladislav Bogdanov <bubble at hoster-ok.com> wrote:
>>> 
>>> 20.01.2015 02:44, Andrew Beekhof wrote:
>>>> 
>>>>> On 16 Jan 2015, at 3:59 pm, Vladislav Bogdanov <bubble at hoster-ok.com> wrote:
>>>>> 
>>>>> 16.01.2015 07:44, Andrew Beekhof wrote:
>>>>>> 
>>>>>>> On 15 Jan 2015, at 3:11 pm, Vladislav Bogdanov <bubble at hoster-ok.com> wrote:
>>>>>>> 
>>>>>>> 13.01.2015 11:32, Andrei Borzenkov wrote:
>>>>>>>> On Tue, Jan 13, 2015 at 10:20 AM, Vladislav Bogdanov
>>>>>>>> <bubble at hoster-ok.com> wrote:
>>>>>>>>> Hi Andrew, David, all.
>>>>>>>>> 
>>>>>>>>> I found a somewhat strange ordering of operations during transition execution.
>>>>>>>>> 
>>>>>>>>> Could you please look at the following partial configuration (crmsh syntax)?
>>>>>>>>> 
>>>>>>>>> ===
>>>>>>>>> ...
>>>>>>>>> clone cl-broker broker \
>>>>>>>>>         meta interleave=true target-role=Started
>>>>>>>>> clone cl-broker-vips broker-vips \
>>>>>>>>>         meta clone-node-max=2 globally-unique=true interleave=true resource-stickiness=0 target-role=Started
>>>>>>>>> clone cl-ctdb ctdb \
>>>>>>>>>         meta interleave=true target-role=Started
>>>>>>>>> colocation broker-vips-with-broker inf: cl-broker-vips cl-broker
>>>>>>>>> colocation broker-with-ctdb inf: cl-broker cl-ctdb
>>>>>>>>> order broker-after-ctdb inf: cl-ctdb cl-broker
>>>>>>>>> order broker-vips-after-broker 0: cl-broker cl-broker-vips
>>>>>>>>> ...
>>>>>>>>> ===
>>>>>>>>> 
>>>>>>>>> After I put one node into standby and then back online, I see the following transition (relevant excerpt):
>>>>>>>>> 
>>>>>>>>> ===
>>>>>>>>>  * Pseudo action:   cl-broker-vips_stop_0
>>>>>>>>>  * Resource action: broker-vips:1   stop on c-pa-0
>>>>>>>>>  * Pseudo action:   cl-broker-vips_stopped_0
>>>>>>>>>  * Pseudo action:   cl-ctdb_start_0
>>>>>>>>>  * Resource action: ctdb            start on c-pa-1
>>>>>>>>>  * Pseudo action:   cl-ctdb_running_0
>>>>>>>>>  * Pseudo action:   cl-broker_start_0
>>>>>>>>>  * Resource action: ctdb            monitor=10000 on c-pa-1
>>>>>>>>>  * Resource action: broker          start on c-pa-1
>>>>>>>>>  * Pseudo action:   cl-broker_running_0
>>>>>>>>>  * Pseudo action:   cl-broker-vips_start_0
>>>>>>>>>  * Resource action: broker          monitor=10000 on c-pa-1
>>>>>>>>>  * Resource action: broker-vips:1   start on c-pa-1
>>>>>>>>>  * Pseudo action:   cl-broker-vips_running_0
>>>>>>>>>  * Resource action: broker-vips:1   monitor=30000 on c-pa-1
>>>>>>>>> ===
>>>>>>>>> 
>>>>>>>>> What could be the reason for stopping a unique clone instance so early during a move?
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> Do not take this as a definitive answer, but cl-broker-vips cannot run
>>>>>>>> unless both other resources are started. So if you compute the closure of
>>>>>>>> all required transitions, it looks rather logical. Having
>>>>>>>> cl-broker-vips started while broker is still stopped would violate
>>>>>>>> the constraint.
>>>>>>> 
>>>>>>> The problem is that broker-vips:1 is stopped on one (source) node unnecessarily early.
>>>>>> 
>>>>>> It looks to be moving from c-pa-0 to c-pa-1
>>>>>> It might be unnecessarily early, but it is what you asked for... we have to unwind the resource stack before we can build it up.
>>>>> 
>>>>> Yes, I understand that it is valid, but could its stop be delayed until the cluster is in a state where all dependencies are satisfied to start it on another node (as with migration)?
>>>> 
>>>> No, because "we have to unwind the resource stack before we can build it up."
>>>> Doing anything else would be one of those things that is trivial for a human to identify but rather complex for a computer.
>>> 
>>> I believe there is also an issue with migration of clone instances.
>>> 
>>> I modified the pe-input to allow migration of cl-broker-vips (and also set an inf score for broker-vips-after-broker
>>> and made cl-broker-vips interleaved).
>>> The relevant part is:
>>> clone cl-broker broker \
>>>        meta interleave=true target-role=Started
>>> clone cl-broker-vips broker-vips \
>>>        meta clone-node-max=2 globally-unique=true interleave=true allow-migrate=true resource-stickiness=0 target-role=Started
>>> clone cl-ctdb ctdb \
>>>        meta interleave=true target-role=Started
>>> colocation broker-vips-with-broker inf: cl-broker-vips cl-broker
>>> colocation broker-with-ctdb inf: cl-broker cl-ctdb
>>> order broker-after-ctdb inf: cl-ctdb cl-broker
>>> order broker-vips-after-broker inf: cl-broker cl-broker-vips
>>> 
>>> After that, (part of) the transition is:
>>> 
>>> * Resource action: broker-vips:1   migrate_to on c-pa-0
>>> * Pseudo action:   cl-broker-vips_stop_0
>>> * Resource action: broker-vips:1   migrate_from on c-pa-1
>>> * Resource action: broker-vips:1   stop on c-pa-0
>>> * Pseudo action:   cl-broker-vips_stopped_0
>>> * Pseudo action:   all_stopped
>>> * Pseudo action:   cl-ctdb_start_0
>>> * Resource action: ctdb            start on c-pa-1
>>> * Pseudo action:   cl-ctdb_running_0
>>> * Pseudo action:   cl-broker_start_0
>>> * Resource action: ctdb            monitor=10000 on c-pa-1
>>> * Resource action: broker          start on c-pa-1
>>> * Pseudo action:   cl-broker_running_0
>>> * Pseudo action:   cl-broker-vips_start_0
>>> * Resource action: broker          monitor=10000 on c-pa-1
>>> * Pseudo action:   broker-vips:1_start_0
>>> * Pseudo action:   cl-broker-vips_running_0
>>> * Resource action: broker-vips:1   monitor=30000 on c-pa-1
>>> 
>>> But I would say that, at least from a human-logic point of view, the above breaks the ordering rule broker-vips-after-broker
>>> (cl-broker-vips finishes migrating and thus runs on c-pa-1 before cl-broker has started there).
>>> Technically broker-vips:1_start_0 is at the right position, but the resource is actually "started"
>>> in migrate_to/migrate_from.
>>> 
>>> 
>>> I also went further and injected a pair of non-clone IPaddr2 resources into the same pe-input, and also enabled migration
>>> for them (setting interleave for cl-broker-vips back to false and the ordering score for broker-vips-after-broker back to 0,
>>> so all three order constraints are adjacent):
>>> 
>>> clone cl-broker broker \
>>>        meta interleave=true target-role=Started
>>> clone cl-broker-vips broker-vips \
>>>        meta clone-node-max=2 globally-unique=true interleave=false allow-migrate=true resource-stickiness=0 target-role=Started
>>> clone cl-ctdb ctdb \
>>>        meta interleave=true target-role=Started
>>> primitive broker-vip1 IPaddr2 \
>>>        params ip=192.168.122.70 cidr_netmask=24 nic=eth0 \
>>>        op start interval=0 timeout=20 \
>>>        op stop interval=0 timeout=20 \
>>>        op monitor interval=30
>>> primitive broker-vip2 IPaddr2 \
>>>        params ip=192.168.122.71 cidr_netmask=24 nic=eth0 \
>>>        op start interval=0 timeout=20 \
>>>        op stop interval=0 timeout=20 \
>>>        op monitor interval=30
>>> colocation broker-with-ctdb inf: cl-broker cl-ctdb
>>> colocation broker-vips-with-broker inf: cl-broker-vips cl-broker
>>> colocation broker-vip1-with-broker inf: broker-vip1 cl-broker
>>> colocation broker-vip2-with-broker inf: broker-vip2 cl-broker
>>> colocation broker-vip2-not-with-vip1 -100: broker-vip2 broker-vip1
>>> order broker-after-ctdb inf: cl-ctdb cl-broker
>>> order broker-vips-after-broker 0: cl-broker cl-broker-vips
>>> order broker-vip1-after-broker 0: cl-broker broker-vip1
>>> order broker-vip2-after-broker 0: cl-broker broker-vip2
>>> 
>>> For broker-vip2 I see completely different output (compare with broker-vips:1):
>>> 
>>> * Resource action: broker-vips:1   migrate_to on c-pa-0
>> 
>> I just noticed this, since when does IPaddr2 migrate?
> 
> I just injected allow-migrate for broker-vip1, broker-vip2 and broker-vips into the pe-input to test what pengine would do

The force is strong with this one…

> but forgot to note that (actually the cl-broker-vips definition above has it enabled but broker-vip{1,2} are missing it; damn, my fault, it should be there too). I need to be more careful.
> For the g-u clone it doesn't solve the issue, btw. But for an ordinary resource it does. That makes me think that migration paths differ for g-u clone instances.

Highly likely.

> Actually, implementing (pseudo-)migration in IPaddr2 doesn't seem to be a very complex task.
> 
>> 
>> Reason I noticed is because broker-vips definitely doesn’t start until the end anymore:
>> 
>>  * Resource action: broker          start on c-pa-1
>>  * Pseudo action:   cl-broker_running_0
>>  * Pseudo action:   cl-broker-vips_start_0
>>  * Resource action: broker          monitor=10000 on c-pa-1
>>  * Resource action: broker-vips:1   start on c-pa-1
> 
> Actually it is migrated at the very beginning of the transition,

Not with rc2 is what I’m saying:

[06:17 AM] beekhof at fedora ~/Development/sources/pacemaker/1.1 ☺ # tools/crm_simulate -Sx ~/Downloads/pe-input-418.bz2  | grep broker-vips
 Clone Set: cl-broker-vips [broker-vips] (unique)
     broker-vips:0	(ocf::heartbeat:IPaddr2):	Started c-pa-0
     broker-vips:1	(ocf::heartbeat:IPaddr2):	Started c-pa-0
 * Move    broker-vips:1	(Started c-pa-0 -> c-pa-1)
 * Pseudo action:   cl-broker-vips_stop_0
 * Resource action: broker-vips:1   stop on c-pa-0
 * Pseudo action:   cl-broker-vips_stopped_0
 * Pseudo action:   cl-broker-vips_start_0
 * Resource action: broker-vips:1   start on c-pa-1
 * Pseudo action:   cl-broker-vips_running_0
 * Resource action: broker-vips:1   monitor=30000 on c-pa-1
 Clone Set: cl-broker-vips [broker-vips] (unique)
     broker-vips:0	(ocf::heartbeat:IPaddr2):	Started c-pa-0
     broker-vips:1	(ocf::heartbeat:IPaddr2):	Started c-pa-1
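
For comparing listings like the two in this thread, here is a minimal Python sketch (not part of Pacemaker; every function and variable name in it is hypothetical). It assumes the order of "Resource action" lines printed by crm_simulate can stand in for execution order, and it counts both start and migrate_from as the point where a resource becomes active on a node. The two embedded listings are shortened excerpts of the output quoted in this thread.

```python
import re

# Matches lines such as " * Resource action: broker   start on c-pa-1",
# with or without leading mail-quote markers. Pseudo actions are ignored.
ACTION_RE = re.compile(r"\*\s+Resource action:\s+(\S+)\s+(\S+)\s+on\s+(\S+)")

# Shortened excerpt of the transition quoted with allow-migrate=true.
MIGRATE_LISTING = """
 * Resource action: broker-vips:1   migrate_to on c-pa-0
 * Pseudo action:   cl-broker-vips_stop_0
 * Resource action: broker-vips:1   migrate_from on c-pa-1
 * Resource action: broker-vips:1   stop on c-pa-0
 * Pseudo action:   cl-ctdb_start_0
 * Resource action: ctdb            start on c-pa-1
 * Pseudo action:   cl-broker_start_0
 * Resource action: broker          start on c-pa-1
 * Pseudo action:   broker-vips:1_start_0
"""

# Excerpt of the transition where broker-vips:1 is started normally.
PLAIN_LISTING = """
 * Resource action: broker          start on c-pa-1
 * Pseudo action:   cl-broker_running_0
 * Pseudo action:   cl-broker-vips_start_0
 * Resource action: broker          monitor=10000 on c-pa-1
 * Resource action: broker-vips:1   start on c-pa-1
"""


def parse_actions(text):
    """Return (resource, operation, node) tuples in listing order."""
    return [m.groups() for m in map(ACTION_RE.search, text.splitlines()) if m]


def first_index(actions, resource, operations, node):
    """Index of the first matching action, or None if there is none."""
    for i, (res, op, where) in enumerate(actions):
        if res == resource and op in operations and where == node:
            return i
    return None


def active_before_prerequisite(text, dependent, prerequisite, node):
    """True if `dependent` becomes active on `node` (start or migrate_from)
    before `prerequisite` is started there, judged by listing order only."""
    actions = parse_actions(text)
    dep = first_index(actions, dependent, {"start", "migrate_from"}, node)
    pre = first_index(actions, prerequisite, {"start"}, node)
    return dep is not None and pre is not None and dep < pre


if __name__ == "__main__":
    for name, listing in (("migrate", MIGRATE_LISTING), ("plain", PLAIN_LISTING)):
        early = active_before_prerequisite(listing, "broker-vips:1", "broker", "c-pa-1")
        print(name, "broker-vips:1 active before broker start:", early)
```

With these excerpts the check flags the migration listing (migrate_from on c-pa-1 precedes the broker start there) and not the plain start listing. It only inspects the printed order and knows nothing about the constraints themselves.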

> and that seems to be a big issue to me, because it breaks ordering (start became a pseudo-action, but the actual work is done in migrate_from, which is run before the broker start).
> 
>> 
>> 
>> 
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>> 
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
>> 
> 
> 