[Pacemaker] resource moving unnecessarily due to ping race condition

Fri Sep 23 07:53:53 EDT 2011

Yes, but the patch only affects the pingd attribute, and we do not want
the other node to be able to force an immediate score comparison. That
is the whole idea behind the fping OCF resource agent we are using: it
gives the timing advantage to the node currently running the resource
by delaying rising scores on the idle node and falling scores on the
active node.
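
To make the timing rule concrete, here is a minimal sketch in C. It is
not the code of the fping RA (that is not shown in this thread); the
function name, its parameters, and the example scores below are
hypothetical and only illustrate which updates get held back.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not taken from the fping RA: returns how many
 * seconds to hold back a score update before it is written. */
int dampen_delay_secs(bool is_active, int old_score, int new_score,
                      int delay_secs)
{
    assert(delay_secs >= 0);

    if (new_score == old_score) {
        return 0;               /* nothing changed, nothing to delay */
    }

    bool rising = new_score > old_score;

    /* Active node: falling scores are delayed, rising ones go out at once.
     * Idle node: rising scores are delayed, falling ones go out at once. */
    if (is_active) {
        return rising ? 0 : delay_secs;
    }
    return rising ? delay_secs : 0;
}
```

With an assumed delay of 30 seconds, when a shared ping host goes down
the idle node writes its lower score immediately while the active node
waits, so the active node never has the lower score during the window.
When the ping host comes back, the roles reverse and the active node
rises first.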

On 09/22/2011 09:10 PM, Andrew Beekhof wrote:
> On Tue, Sep 20, 2011 at 10:34 PM, Brad Johnson<bjohnson at ecessa.com>  wrote:
>> It is not necessarily the case that the outside world can't reach the
>> cluster. Ours is a multi-homed device connecting to multiple WANs and LANs.
>> We want the device with the best connectivity to be the active device. To
>> get around the problem of failovers occurring when a ping node reboots for
>> example, I have written an fping OCF RA that uses different dampening delays
>> depending on whether it is running on the active or idle device. I have also patched
>> pacemaker attrd.c so that it doesn't send an immediate update when it
>> receives a flush message from the other node. This was causing it to ignore
>> any running delay timer.
> That's the point of the flush message though.  So that all nodes write
> their current value at the same time.
>
>> Here is that patch:
>>
>> --- tools/attrd.orig.c    2011-09-13 08:29:46.946820348 -0500
>> +++ tools/attrd.c    2011-09-14 13:33:59.606894754 -0500
>> @@ -348,10 +348,14 @@
>>          attrd_local_callback(xml);
>>
>>      } else if(ignore == NULL || safe_str_neq(from, attrd_uname)) {
>> +        const char *attr  = crm_element_value(xml, F_ATTRD_ATTRIBUTE);
>> +        /* Don't send update for score if msg is from other node */
>> +        if(safe_str_eq(from, attrd_uname) || safe_str_neq(attr, "pingd")) {
>>          crm_info("%s message from %s", op, from);
>>          hash_entry = find_hash_entry(xml);
>>          stop_attrd_timer(hash_entry);
>>          attrd_perform_update(hash_entry);
>> +        }
>>      }
>>      free_xml(xml);
>>   }
>>
>>
>> On 09/19/2011 10:51 PM, Andrew Beekhof wrote:
>>> On Sun, Sep 11, 2011 at 2:30 AM, Vadym Chepkov<vchepkov at gmail.com>    wrote:
>>>> On Sep 8, 2011, at 3:40 PM, Florian Haas wrote:
>>>>
>>>>>>> On 09/08/11 20:59, Brad Johnson wrote:
>>>>>>>> We have a 2 node cluster with a single resource. The resource must
>>>>>>>> run on only a single node at one time. Using the ocf:pacemaker:ping
>>>>>>>> RA we are pinging a WAN gateway and a LAN host on each node so the
>>>>>>>> resource runs on the node with the greatest connectivity. The
>>>>>>>> problem is when a ping host goes down (so both nodes lose
>>>>>>>> connectivity to it), the resource moves to the other node due to
>>>>>>>> timing differences in how fast they update the score attribute. The
>>>>>>>> dampening value has no effect, since it delays both nodes by the
>>>>>>>> same amount. These unnecessary fail-overs aren't acceptable since
>>>>>>>> they are disruptive to the network for no reason.
>>>>>>>> Is there a way to dampen the ping update by different amounts on the
>>>>>>>> active and passive nodes? Or some other way to configure the cluster
>>>>>>>> to try to keep the resource where it is during these tie score
>>>>>>>> scenarios?
>>>>> location pingd-constraint group_1 \
>>>>>   rule $id="pingd-constraint-rule" pingd: defined pingd
>>>>>
>>>>> May I suggest that you simply change this constraint to
>>>>>
>>>>> location pingd-constraint group_1 \
>>>>>   rule $id="pingd-constraint-rule" \
>>>>>     -inf: not_defined pingd or pingd lte 0
>>>>>
>>>>> That way, only a host that definitely has _no_ connectivity carries a
>>>>> -INF score for that resource group. And I believe that is what you
>>>>> really want, rather than take the actual ping score as a placement
>>>>> weight (your "best connectivity" approach).
>>>>>
>>>>> Just my 2 cents, though.
>>>>>
>>>> Even though this approach has been recommended many times, there is a
>>>> problem with it.
>>>> What if, for some reason, none of the nodes can ping?
>>>> This rule would bring the resource down completely, whereas with the
>>>> "best connectivity" approach it stays up where it was before the
>>>> network failed.
>>> If the outside[1] world can't reach the cluster, is there much benefit
>>> in having it running?
>>>
>>> [1] Substitute "outside" for wherever your users are, hopefully you
>>> picked a ping node from the same area.
>>>
>>>> Vadym
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>>
>>>> Project Home: http://www.clusterlabs.org
>>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>> Bugs:
>>>> http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
>>>>