[Pacemaker] ping resource polling skew

Florian Crouzat gentoo at floriancrouzat.net
Wed Mar 20 04:09:41 EDT 2013


On 20/03/2013 04:11, Quentin Smith wrote:

>>> Is there any way to get Pacemaker to delay resource transitions until at
>>> least one full polling cycle has happened, so that in the event of an
>>> outage of the ping target, resources stay put where they are running?
>>
>> there is the "dampen" parameter .... use a high value like 3 or more
>> times the monitor-interval to give all nodes the chance to detect the
>> dead target(s), that should help.
>
> Does that actually help in this case? My understanding is that the
> dampen parameter will delay the attribute change for each host, but
> those delays will still tick down separately for each node, resulting in
> exactly the same behavior, just delayed by dampen seconds.
>

I have had the same questions, and I was quite surprised to see this 
issue wasn't really mentioned anywhere.
So far, I've been relying on the dampen parameter.
Here is my resource definition:

primitive ping-nq-sw-swsec ocf:pacemaker:ping \
         params host_list="192.168.10.1 192.168.2.11 192.168.2.12" \
                dampen="35s" attempts="2" timeout="2" multiplier="100" \
         op monitor interval="15s"

As I understand it, a node cannot trigger any transition until 35s 
(dampen) have passed since that particular node lost a ping node.
With a monitor interval of 15s, I can be sure that within those 35s 
all the other nodes will also have marked that ping node as dead, so 
they all end up with the same score and nothing moves (35s > 2*15s, so 
every node has pinged at least twice during the dampen delay).
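
To make the arithmetic above concrete, here is a rough Python sketch of the reasoning only. It is not how Pacemaker or attrd work internally: it assumes each node runs its ping monitor on a fixed schedule with its own start offset, ignores the time the ping itself takes, and simply checks that every node notices the dead target within the dampen window opened by the first node to notice. The function names and the offsets are made up for illustration.

```python
def detection_times(failure_time, interval, offsets):
    """Per node, time of the first monitor run at or after the failure.

    Assumption: node i runs its monitor at offsets[i], offsets[i] + interval, ...
    and a run detects the dead target instantly (ping duration is ignored).
    """
    times = []
    for offset in offsets:
        if failure_time <= offset:
            first_run = offset
        else:
            elapsed = failure_time - offset
            runs = -(-elapsed // interval)  # ceiling division on integers
            first_run = offset + runs * interval
        times.append(first_run)
    return times


def all_detect_within_dampen(failure_time, interval, dampen, offsets):
    """True if every node detects the failure before the first detector's
    dampen delay has run out."""
    times = detection_times(failure_time, interval, offsets)
    first = min(times)
    return all(t - first < dampen for t in times)


# Values from the resource above: interval=15s, dampen=35s.
# The three monitor offsets (0s, 7s, 14s) are hypothetical.
print(detection_times(20, 15, [0, 7, 14]))
print(all_detect_within_dampen(20, 15, 35, [0, 7, 14]))
```

In this model the gap between the first and the last node to notice is always less than one monitor interval, which is why a dampen of more than twice the interval leaves some margin. Whether the per-node dampen timers then converge on the same score is a separate question that this sketch does not answer.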

Hope that helps.

-- 
Cheers,
Florian Crouzat



