[Pacemaker] resource moving unnecessarily due to ping race condition

Brad Johnson bjohnson at ecessa.com
Thu Sep 8 15:59:06 EDT 2011


Thank you for your quick response. First, ideally we do want the "best 
connectivity" approach. Assuming each node is connected to the ping 
hosts via separate NICs, switches, cables, etc., a failure in one of 
those components will leave one node with degraded network 
connectivity. But if that approach is not possible without 
spurious fail-overs in tie-score situations, then we may have to 
settle for the "all or nothing" approach that you suggest. Second, 
doesn't your suggested approach have the same race condition if 
both nodes' scores drop to zero (e.g. connectivity lost from both nodes 
to a single ping host)?
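
To make the second point concrete, here is a minimal Python sketch of the 
scenario I have in mind. It is only a toy model, not how the policy engine 
actually works: the node names, the attribute value of 1000, and the 
placement() and simulate() helpers are all assumptions made up for 
illustration. The only part taken from this thread is the rule itself 
(-inf when pingd is not defined or lte 0).

```python
NEG_INF = float("-inf")


def rule_score(pingd):
    """Score from the rule '-inf: not_defined pingd or pingd lte 0'."""
    if pingd is None or pingd <= 0:
        return NEG_INF
    return 0.0


def placement(attrs, current):
    """Hypothetical placement: stay put if allowed, else pick any allowed node."""
    allowed = [node for node, value in attrs.items() if rule_score(value) > NEG_INF]
    if current in allowed:
        return current
    return allowed[0] if allowed else None


def simulate(updates, attrs, current):
    """Apply attribute updates one at a time, re-evaluating placement after each."""
    history = [current]
    for node, value in updates:
        attrs = {**attrs, node: value}
        current = placement(attrs, current)
        history.append(current)
    return history


# Assumed starting point: one ping host reachable from both nodes,
# resource active on node1.
start = {"node1": 1000, "node2": 1000}

# The ping host goes down; the two nodes update their attribute at
# slightly different times.
active_first = simulate([("node1", 0), ("node2", 0)], start, "node1")
passive_first = simulate([("node2", 0), ("node1", 0)], start, "node1")

print(active_first)   # ['node1', 'node2', None]
print(passive_first)  # ['node1', 'node1', None]
```

In this model, if the active node happens to update first, the resource 
still moves once before both nodes end up at -INF. Whether a real cluster 
acts inside that window presumably depends on when the attribute changes 
are committed and evaluated.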

Regards,
Brad

On 09/08/2011 02:40 PM, Florian Haas wrote:
>>> On 09/08/11 20:59, Brad Johnson wrote:
>>>> We have a 2 node cluster with a single resource. The resource must run
>>>> on only a single node at one time. Using the ocf:pacemaker:ping RA we
>>>> are pinging a WAN gateway and a LAN host on each node so the resource
>>>> runs on the node with the greatest connectivity. The problem is when a
>>>> ping host goes down (so both nodes lose connectivity to it), the
>>>> resource moves to the other node due to timing differences in how fast
>>>> they update the score attribute. The dampening value has no effect,
>>>> since it delays both nodes by the same amount. These unnecessary
>>>> fail-overs aren't acceptable since they are disruptive to the network
>>>> for no reason.
>>>> Is there a way to dampen the ping update by different amounts on the
>>>> active and passive nodes? Or some other way to configure the cluster to
>>>> try to keep the resource where it is during these tie score scenarios?
> location pingd-constraint group_1 \
>    rule $id="pingd-constraint-rule" pingd: defined pingd
>
> May I suggest that you simply change this constraint to
>
> location pingd-constraint group_1 \
>    rule $id="pingd-constraint-rule" \
>      -inf: not_defined pingd or pingd lte 0
>
> That way, only a host that definitely has _no_ connectivity carries a
> -INF score for that resource group. And I believe that is what you
> really want, rather than take the actual ping score as a placement
> weight (your "best connectivity" approach).
>
> Just my 2 cents, though.
>
> Cheers,
> Florian