[Pacemaker] [PING] ping, pingd and CIB updates, pick your poison :)

Andrew Beekhof andrew at beekhof.net
Thu Sep 2 02:47:57 EDT 2010


On Fri, Jul 30, 2010 at 8:38 AM, Thomas Guthmann <tguthmann at iseek.com.au> wrote:
> Re,
>
>> [..] I can provide a hb_report if necessary.
> See the attached report for the simple config below. Note that I carelessly
> erased the config before generating the report, so I have pasted it below.
>
> I set up a very simple cluster: 2 nodes running a dummy resource, with 2
> cloned ping resources testing a virtual IP that I brought up or down for
> the test.
>
> 11:20AM : cluster is up and running
> 11:25AM : shutdown the IP
> 11:30AM : force a refresh with attrd_updater (because pingd is still 1)
>          It doesn't change anything; the IP is still seen as up...
> 11:37AM : change a value in the CIB, e.g. dampen from 120 to 121
>          Now pingd is null on db2 but still 1 on db1. The crm changes
>          were made on db2; I don't know if that's related.
> 11:40AM : start the IP again
> 12:00PM : IP is still seen as down...
>
> primitive dummy ocf:pacemaker:Dummy
> primitive ping ocf:pacemaker:ping \
>    params host_list="IP.TO.TE.ST" dampen="121" attempts="3" debug="1"
> clone CONNECTIVITY ping
> location rule-connectivity dummy \
>    rule $id="rule-ping" -inf: not_defined pingd or pingd number:lte 0
> property $id="cib-bootstrap-options" \
>         dc-version="1.0.9-89bd754939df5150de7cd76835f98fe90851b677" \
>         cluster-infrastructure="openais" \
>         expected-quorum-votes="2" \
>         no-quorum-policy="ignore" \
>         pe-warn-series-max="2880" \
>         stonith-enabled="false"
> rsc_defaults $id="rsc_defaults-options" \
>         resource-stickiness="1"
>
> Thomas

It certainly looks like it's working...

Jul 30 11:37:50 db1.icare.appnet.iseek.com.au ping[11699]: WARNING:
202.83.64.201 is inactive: PING 202.83.64.201 (202.83.64.201) 56(84)
bytes of data.#012#012--- 202.83.64.201 ping statistics ---#0123
packets transmitted, 0 received, +1 errors, 100% packet loss, time
2000ms

Followed by

Jul 30 11:37:50 db1.icare.appnet.iseek.com.au attrd_updater: [11718]:
info: Invoked: attrd_updater -n pingd -v 0 -d 121
Jul 30 11:39:51 db1.icare.appnet.iseek.com.au attrd: [3936]: info:
attrd_trigger_update: Sending flush op to all hosts for: pingd (0)
Jul 30 11:39:51 db1.icare.appnet.iseek.com.au attrd: [3936]: info:
attrd_perform_update: Sent update 65: pingd=0
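
Note the timing: attrd_updater was invoked at 11:37:50 with -d 121, and
attrd sent the flush at 11:39:51, which matches the 121-second dampen
interval. A quick sketch of that arithmetic in Python (the timestamps are
copied from the log lines above; the year is an assumption taken from the
mail date, since syslog omits it):

```python
from datetime import datetime

FMT = "%Y %b %d %H:%M:%S"
YEAR = "2010"  # assumption: syslog lines carry no year

# Timestamps copied from the attrd_updater and attrd log lines
invoked = datetime.strptime(f"{YEAR} Jul 30 11:37:50", FMT)
flushed = datetime.strptime(f"{YEAR} Jul 30 11:39:51", FMT)

dampen = 121  # seconds, from "attrd_updater -n pingd -v 0 -d 121"
delay = int((flushed - invoked).total_seconds())

print(delay, delay == dampen)
```

So a gap of about two minutes between the ping failure and the CIB update
is what this dampen value would be expected to produce.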

Alas, debug logging isn't enabled, so I can't say for sure that the update
call returned, but this makes it pretty likely:

Jul 30 11:39:51 db1.icare.appnet.iseek.com.au crmd: [3938]: info:
abort_transition_graph: te_update_diff:146 - Triggered transition
abort (complete=1, tag=transient_attributes,
id=db1.icare.appnet.iseek.com.au, magic=NA, cib=0.183.11) : Transient
attribute: update
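
For reference, once pingd=0 is stored for db1, the location rule from the
quoted configuration (-inf: not_defined pingd or pingd number:lte 0) should
match on that node and keep dummy off it. A rough Python sketch of how that
expression evaluates; the function name and the dict of node attributes are
hypothetical and only model the rule, they are not Pacemaker code:

```python
def rule_bans_node(attrs):
    """Return True when the rule expression matches (score -inf applies)."""
    value = attrs.get("pingd")
    if value is None:
        # "not_defined pingd": the attribute is missing on this node
        return True
    # "pingd number:lte 0": numeric comparison against zero
    return float(value) <= 0


# Hypothetical attribute sets modelled on the states described in this thread
print(rule_bans_node({"pingd": "0"}))  # pingd=0, as in update 65 on db1
print(rule_bans_node({}))              # pingd null, as reported for db2
print(rule_bans_node({"pingd": "1"}))  # target reachable
```

The first two cases match the rule and the third does not, so under this
reading dummy is only allowed on a node whose pingd value is above 0.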



