[Pacemaker] resource moving unnecessarily due to ping race condition

Tue Sep 20 08:34:43 EDT 2011

It is not necessarily the case that the outside world can't reach the
cluster. Ours is a multi-homed device connected to multiple WANs and
LANs, and we want the device with the best connectivity to be the
active one. To avoid failovers when, for example, a ping node reboots,
I have written an fping OCF RA that uses different dampening delays
depending on whether it is running on the active or the idle device. I
have also patched Pacemaker's attrd.c so that it no longer sends an
immediate update of the pingd attribute when it receives a flush
message from the other node. That immediate update was causing attrd
to ignore any running delay timer. Here is the patch:

--- tools/attrd.orig.c    2011-09-13 08:29:46.946820348 -0500
+++ tools/attrd.c    2011-09-14 13:33:59.606894754 -0500
@@ -348,10 +348,14 @@
          attrd_local_callback(xml);

      } else if(ignore == NULL || safe_str_neq(from, attrd_uname)) {
+        const char *attr  = crm_element_value(xml, F_ATTRD_ATTRIBUTE);
+        /* Don't send update for score if msg is from other node */
+        if(safe_str_eq(from, attrd_uname) || safe_str_neq(attr, "pingd")) {
          crm_info("%s message from %s", op, from);
          hash_entry = find_hash_entry(xml);
          stop_attrd_timer(hash_entry);
          attrd_perform_update(hash_entry);
+        }
      }
      free_xml(xml);
  }
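
For anyone who would rather read the condition than the diff, here is a
stand-alone sketch of the decision the patched branch makes. It is not
code from attrd.c: should_update_now(), str_eq() and the have_ignore
flag are hypothetical names made up for this illustration, and str_eq()
uses a plain strcmp(), which may not match how Pacemaker's own
safe_str_eq()/safe_str_neq() compare strings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical NULL-safe comparison; a stand-in, not Pacemaker's helper. */
static bool str_eq(const char *a, const char *b)
{
    if (a == b) {
        return true;            /* same pointer, or both NULL */
    }
    if (a == NULL || b == NULL) {
        return false;
    }
    return strcmp(a, b) == 0;
}

/*
 * Returns true when a flush message should stop the delay timer and
 * trigger an immediate update.
 *   from        - name of the node that sent the message
 *   self        - name of the local node (attrd_uname in the patch)
 *   attr        - attribute named in the message
 *   have_ignore - true when "ignore" in the outer condition is non-NULL
 */
static bool should_update_now(const char *from, const char *self,
                              const char *attr, bool have_ignore)
{
    assert(self != NULL);

    /* Outer condition, unchanged by the patch:
     * ignore == NULL || from != attrd_uname */
    if (have_ignore && str_eq(from, self)) {
        return false;
    }

    /* Added by the patch: skip the immediate update only when the
     * message is about "pingd" and comes from the other node. */
    return str_eq(from, self) || !str_eq(attr, "pingd");
}
```

With this, a "pingd" flush from the peer leaves the local dampening
timer running, while flushes for any other attribute behave as before.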


On 09/19/2011 10:51 PM, Andrew Beekhof wrote:
> On Sun, Sep 11, 2011 at 2:30 AM, Vadym Chepkov<vchepkov at gmail.com>  wrote:
>> On Sep 8, 2011, at 3:40 PM, Florian Haas wrote:
>>
>>>>> On 09/08/11 20:59, Brad Johnson wrote:
>>>>>> We have a 2 node cluster with a single resource. The resource must run
>>>>>> on only a single node at one time. Using the pacemaker:ocf:ping RA we
>>>>>> are pinging a WAN gateway and a LAN host on each node so the resource
>>>>>> runs on the node with the greatest connectivity. The problem is when a
>>>>>> ping host goes down (so both nodes lose connectivity to it), the
>>>>>> resource moves to the other node due to timing differences in how fast
>>>>>> they update the score attribute. The dampening value has no effect,
>>>>>> since it delays both nodes by the same amount. These unnecessary
>>>>>> fail-overs aren't acceptable since they are disruptive to the network
>>>>>> for no reason.
>>>>>> Is there a way to dampen the ping update by different amounts on the
>>>>>> active and passive nodes? Or some other way to configure the cluster to
>>>>>> try to keep the resource where it is during these tie score scenarios?
>>> location pingd-constraint group_1 \
>>>   rule $id="pingd-constraint-rule" pingd: defined pingd
>>>
>>> May I suggest that you simply change this constraint to
>>>
>>> location pingd-constraint group_1 \
>>>   rule $id="pingd-constraint-rule" \
>>>     -inf: not_defined pingd or pingd lte 0
>>>
>>> That way, only a host that definitely has _no_ connectivity carries a
>>> -INF score for that resource group. And I believe that is what you
>>> really want, rather than take the actual ping score as a placement
>>> weight (your "best connectivity" approach).
>>>
>>> Just my 2 cents, though.
>>>
>> Even though this approach has been recommended many times, there is a problem with it.
>> What if all nodes are, for some reason, unable to ping?
>> This rule would cause the resource to be brought down completely, whereas with the "best connectivity" approach it stays up where it was before the network failed.
> If the outside[1] world can't reach the cluster, is there much benefit
> in having it running?
>
> [1] Replace "outside" with wherever your users are; hopefully you
> picked a ping node from the same area.
>
>> Vadym
>>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker