[Pacemaker] resource moving unnecessarily due to ping race condition

Mon Sep 26 01:58:29 UTC 2011

On Fri, Sep 23, 2011 at 9:53 PM, Brad Johnson <bjohnson at ecessa.com> wrote:
> Yes, but the patch only affects the pingd attribute.

Use of the name 'pingd' isn't mandatory though.
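
To make that concern concrete, here is a minimal Python sketch of the inner condition the quoted patch adds. It is a model for discussion only, not attrd code: the function name and the node and attribute names in the usage lines are hypothetical. The one thing taken from the patch is the comparison against the literal string "pingd".

```python
def should_flush_on_message(sender, local_node, attr_name):
    """Model of the patched check: perform the immediate update only if the
    message came from this node, or the attribute is not literally "pingd"."""
    return sender == local_node or attr_name != "pingd"


# A flush from the peer for "pingd" is suppressed by the patch...
peer_pingd = should_flush_on_message("node2", "node1", "pingd")

# ...but the same flush for a differently named attribute is not, so an
# agent configured with another attribute name gets no protection.
peer_other = should_flush_on_message("node2", "node1", "connectivity")
```

So the patch only helps if the ping agent's attribute keeps the default name.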

> And we do not want the
> other node to be able to challenge us to an immediate score comparison. That
> is the whole idea behind the fping OCF resource agent we are using, to give
> the timing advantage to the node currently running the resource by delaying
> rising scores on the idle, and falling scores on the active node.

Why not just set dampen=0?
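
For anyone following the thread, the asymmetric dampening described in the quoted text can be sketched roughly as below. This is an illustration under stated assumptions, not the actual fping RA: the function names, scores, detection times, and the 5 second delay are all made up, and it assumes both nodes start with equal scores and that a tie leaves the resource where it is.

```python
def dampen_delay(is_active, old_score, new_score, delay):
    """Delay falling scores on the active node and rising scores on the
    idle node; publish every other change immediately."""
    if is_active and new_score < old_score:
        return delay
    if not is_active and new_score > old_score:
        return delay
    return 0.0


def idle_briefly_ahead(old_score, new_score, detect_active, detect_idle,
                       delay_active, delay_idle):
    """Return True if there is a window in which the idle node's published
    score is higher than the active node's."""
    publish_active = detect_active + delay_active
    publish_idle = detect_idle + delay_idle
    if new_score < old_score:
        # Falling: idle looks better if the active node publishes first.
        return publish_active < publish_idle
    if new_score > old_score:
        # Rising: idle looks better if the idle node publishes first.
        return publish_idle < publish_active
    return False


# A ping host goes down; the active node notices 0.3s before the idle node.
# Equal dampening on both nodes preserves the 0.3s gap.
symmetric = idle_briefly_ahead(2, 1, 0.0, 0.3, 5.0, 5.0)

# Asymmetric dampening: active holds the falling score, idle publishes at once.
asymmetric = idle_briefly_ahead(
    2, 1, 0.0, 0.3,
    dampen_delay(True, 2, 1, 5.0),
    dampen_delay(False, 2, 1, 5.0),
)
```

With equal delays the idle node is briefly ahead, which is the unnecessary move reported at the start of the thread. With the asymmetric delays it never is, as long as the delay is longer than the difference in detection times.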

>
> On 09/22/2011 09:10 PM, Andrew Beekhof wrote:
>>
>> On Tue, Sep 20, 2011 at 10:34 PM, Brad Johnson <bjohnson at ecessa.com> wrote:
>>>
>>> It is not necessarily the case that the outside world can't reach the
>>> cluster. Ours is a multi-homed device connecting to multiple WANs and LANs.
>>> We want the device with the best connectivity to be the active device. To
>>> get around the problem of failovers occurring when a ping node reboots, for
>>> example, I have written an fping OCF RA that uses different dampening delays
>>> based on whether it is running on the active or idle device. I have also
>>> patched pacemaker attrd.c so that it doesn't send an immediate update when it
>>> receives a flush message from the other node. This was causing it to ignore
>>> any running delay timer.
>>
>> That's the point of the flush message though.  So that all nodes write
>> their current value at the same time.
>>
>>> Here is that patch:
>>>
>>> --- tools/attrd.orig.c    2011-09-13 08:29:46.946820348 -0500
>>> +++ tools/attrd.c    2011-09-14 13:33:59.606894754 -0500
>>> @@ -348,10 +348,14 @@
>>>         attrd_local_callback(xml);
>>>
>>>     } else if(ignore == NULL || safe_str_neq(from, attrd_uname)) {
>>> +        const char *attr  = crm_element_value(xml, F_ATTRD_ATTRIBUTE);
>>> +        /* Don't send update for score if msg is from other node */
>>> +        if(safe_str_eq(from, attrd_uname) || safe_str_neq(attr, "pingd")) {
>>>         crm_info("%s message from %s", op, from);
>>>         hash_entry = find_hash_entry(xml);
>>>         stop_attrd_timer(hash_entry);
>>>         attrd_perform_update(hash_entry);
>>> +        }
>>>     }
>>>     free_xml(xml);
>>>  }
>>>
>>>
>>> On 09/19/2011 10:51 PM, Andrew Beekhof wrote:
>>>>
>>>> On Sun, Sep 11, 2011 at 2:30 AM, Vadym Chepkov <vchepkov at gmail.com> wrote:
>>>>>
>>>>> On Sep 8, 2011, at 3:40 PM, Florian Haas wrote:
>>>>>
>>>>>>>> On 09/08/11 20:59, Brad Johnson wrote:
>>>>>>>>>
>>>>>>>>> We have a 2 node cluster with a single resource. The resource must run
>>>>>>>>> on only a single node at one time. Using the ocf:pacemaker:ping RA we
>>>>>>>>> are pinging a WAN gateway and a LAN host on each node so the resource
>>>>>>>>> runs on the node with the greatest connectivity. The problem is when a
>>>>>>>>> ping host goes down (so both nodes lose connectivity to it), the
>>>>>>>>> resource moves to the other node due to timing differences in how fast
>>>>>>>>> they update the score attribute. The dampening value has no effect,
>>>>>>>>> since it delays both nodes by the same amount. These unnecessary
>>>>>>>>> fail-overs aren't acceptable since they are disruptive to the network
>>>>>>>>> for no reason.
>>>>>>>>> Is there a way to dampen the ping update by different amounts on the
>>>>>>>>> active and passive nodes? Or some other way to configure the cluster to
>>>>>>>>> try to keep the resource where it is during these tie score scenarios?
>>>>>>
>>>>>> location pingd-constraint group_1 \
>>>>>>  rule $id="pingd-constraint-rule" pingd: defined pingd
>>>>>>
>>>>>> May I suggest that you simply change this constraint to
>>>>>>
>>>>>> location pingd-constraint group_1 \
>>>>>>  rule $id="pingd-constraint-rule" \
>>>>>>    -inf: not_defined pingd or pingd lte 0
>>>>>>
>>>>>> That way, only a host that definitely has _no_ connectivity carries a
>>>>>> -INF score for that resource group. And I believe that is what you
>>>>>> really want, rather than take the actual ping score as a placement
>>>>>> weight (your "best connectivity" approach).
>>>>>>
>>>>>> Just my 2 cents, though.
>>>>>>
>>>>> Even though this approach was recommended many times, there is a problem
>>>>> with it.
>>>>> What if all nodes for some reason are not able to ping?
>>>>> This rule would cause a resource to be brought down completely, whereas
>>>>> if you use the "best connectivity" approach it will stay up where it was
>>>>> before the network failed.
>>>>
>>>> If the outside[1] world can't reach the cluster, is there much benefit
>>>> in having it running?
>>>>
>>>> [1] Substitute "outside" for wherever your users are, hopefully you
>>>> picked a ping node from the same area.
>>>>
>>>>> Vadym
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>>>
>>>>> Project Home: http://www.clusterlabs.org
>>>>> Getting started:
>>>>> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>>> Bugs:
>>>>>
>>>>> http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker