[Pacemaker] Pacemaker 1.1.10 rc 5 & rc 6

Andrii Moiseiev amoiseiev at gmail.com
Fri Jul 5 11:28:58 EDT 2013


Hi,

I'm trying to update Pacemaker on CentOS 6.4 hosts, but each release
introduces some new problems %).

We use the CentOS 6.4 corosync and cman packages and the latest pcs /
pacemaker. The cluster is cman-based.

Pacemaker 1.1.10 rc5 was almost fine, except for a message that repeats on
all of our nodes:

Jul  5 10:32:34 devpacemaker01 stonith-ng[14501]:   notice:
update_cib_cache_cb: [cib_diff_notify] Patch aborted: Application of an
update diff failed (-206)
Jul  5 10:32:34 devpacemaker01 stonith-ng[14501]:  warning:
cib_process_diff: Diff 0.4.83 -> 0.4.84 from local not applied to 0.4.83:
Failed application of an update diff
Jul  5 10:32:34 devpacemaker01 stonith-ng[14501]:   notice:
update_cib_cache_cb: [cib_diff_notify] Patch aborted: Application of an
update diff failed (-206)
Jul  5 10:32:34 devpacemaker01 stonith-ng[14501]:  warning:
cib_process_diff: Diff 0.4.84 -> 0.4.85 from local not applied to 0.4.84:
Failed application of an update diff
Jul  5 10:32:34 devpacemaker01 stonith-ng[14501]:   notice:
update_cib_cache_cb: [cib_diff_notify] Patch aborted: Application of an
update diff failed (-206)
Jul  5 10:32:34 devpacemaker01 stonith-ng[14501]:  warning:
cib_process_diff: Diff 0.4.85 -> 0.4.86 from local not applied to 0.4.85:
Failed application of an update diff
Jul  5 10:32:34 devpacemaker01 stonith-ng[14501]:   notice:
update_cib_cache_cb: [cib_diff_notify] Patch aborted: Application of an
update diff failed (-206)
Jul  5 10:32:34 devpacemaker01 stonith-ng[14501]:  warning:
cib_process_diff: Diff 0.4.86 -> 0.4.87 from local not applied to 0.4.86:
Failed application of an update diff
Jul  5 10:32:34 devpacemaker01 stonith-ng[14501]:   notice:
update_cib_cache_cb: [cib_diff_notify] Patch aborted: Application of an
update diff failed (-206)
Jul  5 10:32:34 devpacemaker01 stonith-ng[14501]:  warning:
cib_process_diff: Diff 0.4.87 -> 0.4.88 from local not applied to 0.4.87:
Failed application of an update diff
Jul  5 10:32:34 devpacemaker01 stonith-ng[14501]:   notice:
update_cib_cache_cb: [cib_diff_notify] Patch aborted: Application of an
update diff failed (-206)
Jul  5 10:32:35 devpacemaker01 stonith-ng[14501]:  warning:
cib_process_diff: Diff 0.4.88 -> 0.4.89 from local not applied to 0.4.88:
Failed application of an update diff
Jul  5 10:32:35 devpacemaker01 stonith-ng[14501]:   notice:
update_cib_cache_cb: [cib_diff_notify] Patch aborted: Application of an
update diff failed (-206)

It isn't critical, since everything worked, but it looks strange: no matter
what you do, it keeps complaining. A full CIB resync, importing a fresh
configuration, nothing helps.
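
To check whether every consecutive CIB version is being rejected (as the
excerpt above suggests), the stonith-ng warnings can be pulled out of the log
with a small script. This is only a rough sketch: the regular expression is
inferred from the log lines quoted in this mail, not from Pacemaker's source,
and the function name failed_diffs is made up for illustration.

```python
import re

# Pattern inferred from the cib_process_diff warnings quoted in this mail
# (an assumption, not taken from Pacemaker's source code).
DIFF_RE = re.compile(
    r"cib_process_diff:\s+Diff\s+(\d+\.\d+\.\d+)\s+->\s+(\d+\.\d+\.\d+)\s+"
    r"from\s+(\S+)\s+not\s+applied\s+to\s+(\d+\.\d+\.\d+)"
)


def failed_diffs(log_text):
    """Return (from_version, to_version, origin, cached_version) tuples."""
    # Mail clients wrap long syslog lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    return DIFF_RE.findall(flat)


sample = """
Jul  5 10:32:34 devpacemaker01 stonith-ng[14501]:  warning:
cib_process_diff: Diff 0.4.83 -> 0.4.84 from local not applied to 0.4.83:
Failed application of an update diff
Jul  5 10:32:34 devpacemaker01 stonith-ng[14501]:  warning:
cib_process_diff: Diff 0.4.84 -> 0.4.85 from local not applied to 0.4.84:
Failed application of an update diff
"""

for old, new, origin, cached in failed_diffs(sample):
    print(f"{old} -> {new} (from {origin}, cache at {cached})")
```

If each tuple's "from" version equals the previous tuple's "to" version, no
diff in that range was applied to the stonith-ng cache.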

Cluster Properties:
 cluster-delay: 10s
 cluster-infrastructure: cman
 cluster-recheck-interval: 2min
 last-lrm-refresh: 1373023780
 no-quorum-policy: freeze
 start-failure-is-fatal: true
 stonith-enabled: false

Today we upgraded to 1.1.10 rc6, and it made things worse. It also broke
'default' fencing. Previously, even with stonith-enabled: false, Pacemaker
tried to kill cman / corosync when the connection was lost or a split brain
occurred, but now that no longer happens:

Jul  5 09:54:25 devpacemaker01 crmd[20840]:   notice:
tengine_stonith_notify: Peer devpacemaker03_eth1 was not terminated
(reboot) by devpacemaker02_eth1 for devpacemaker02_eth1: No such device
(ref=1fc11b87-529d-4f6c-b4e6-ffaa82c06bd8) by client stonith_admin.cman.8832
Jul  5 09:54:28 devpacemaker01 stonith-ng[20838]:   notice: remote_op_done:
Operation reboot of devpacemaker03_eth1 by devpacemaker02_eth1 for
stonith_admin.cman.8855 at devpacemaker02_eth1.6e0e0da3: No such device
Jul  5 09:54:28 devpacemaker01 crmd[20840]:   notice:
tengine_stonith_notify: Peer devpacemaker03_eth1 was not terminated
(reboot) by devpacemaker02_eth1 for devpacemaker02_eth1: No such device
(ref=6e0e0da3-f9f9-43a0-933e-0ff9ec2cb390) by client stonith_admin.cman.8855
Jul  5 09:54:31 devpacemaker01 stonith-ng[20838]:   notice: remote_op_done:
Operation reboot of devpacemaker03_eth1 by devpacemaker02_eth1 for
stonith_admin.cman.9017 at devpacemaker02_eth1.955b859b: No such device
Jul  5 09:54:31 devpacemaker01 crmd[20840]:   notice:
tengine_stonith_notify: Peer devpacemaker03_eth1 was not terminated
(reboot) by devpacemaker02_eth1 for devpacemaker02_eth1: No such device
(ref=955b859b-791e-4083-b760-a6f8f05ddc2f) by client stonith_admin.cman.9017
Jul  5 09:54:35 devpacemaker01 stonith-ng[20838]:   notice: remote_op_done:
Operation reboot of devpacemaker03_eth1 by devpacemaker02_eth1 for
stonith_admin.cman.9089 at devpacemaker02_eth1.ede9aa4e: No such device
Jul  5 09:54:35 devpacemaker01 crmd[20840]:   notice:
tengine_stonith_notify: Peer devpacemaker03_eth1 was not terminated
(reboot) by devpacemaker02_eth1 for devpacemaker02_eth1: No such device
(ref=ede9aa4e-32e0-4f3d-bd3a-f519c1250363) by client stonith_admin.cman.9089
Jul  5 09:54:38 devpacemaker01 stonith-ng[20838]:   notice: remote_op_done:
Operation reboot of devpacemaker03_eth1 by devpacemaker02_eth1 for
stonith_admin.cman.9242 at devpacemaker02_eth1.2d92ca8d: No such device
Jul  5 09:54:38 devpacemaker01 crmd[20840]:   notice:
tengine_stonith_notify: Peer devpacemaker03_eth1 was not terminated
(reboot) by devpacemaker02_eth1 for devpacemaker02_eth1: No such device
(ref=2d92ca8d-2fe5-46de-bd62-ac6446074c5e) by client stonith_admin.cman.9242
Jul  5 09:54:42 devpacemaker01 stonith-ng[20838]:   notice: remote_op_done:
Operation reboot of devpacemaker03_eth1 by devpacemaker02_eth1 for
stonith_admin.cman.9386 at devpacemaker02_eth1.a1b0dc29: No such device
Jul  5 09:54:42 devpacemaker01 crmd[20840]:   notice:
tengine_stonith_notify: Peer devpacemaker03_eth1 was not terminated
(reboot) by devpacemaker02_eth1 for devpacemaker02_eth1: No such device
(ref=a1b0dc29-f817-4b2a-bd56-dc9697a4f68f) by client stonith_admin.cman.9386
Jul  5 09:58:07 devpacemaker01 stonith-ng[20838]:    error: remote_op_done:
Operation reboot of devpacemaker03_eth1 by devpacemaker01_eth1 for
stonith_admin.25870 at devpacemaker01_eth1.f5ed657f: No such device
Jul  5 09:58:07 devpacemaker01 crmd[20840]:   notice:
tengine_stonith_notify: Peer devpacemaker03_eth1 was not terminated
(reboot) by devpacemaker01_eth1 for devpacemaker01_eth1: No such device
(ref=f5ed657f-bdd8-429f-9f09-d3c82ee1201c) by client stonith_admin.25870

I see no way to stop these messages; restarting Pacemaker on all nodes
didn't help. The other bad thing is that the node that previously lost its
network connection, devpacemaker03_eth1 (which caused the resource
failover), took the resource back, disregarding the resource stickiness
settings. It worked just fine on rc5. Weird.
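
For anyone who wants to tally the failed fencing attempts per node, something
like the following could summarise the remote_op_done lines. Again, this is a
sketch under assumptions: the pattern is inferred from the excerpt above, the
helper names (unwrap, failed_fence_ops) are hypothetical, and the list archive
appears to render "@" as " at ", so the pattern accepts both.

```python
import re
from collections import Counter

# A syslog entry starts with a timestamp such as "Jul  5 09:54:28 ".
TIMESTAMP_RE = re.compile(r"^[A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\s")

# Pattern inferred from the remote_op_done lines quoted in this mail.
# "@" may show up as " at " in archived copies, so both are accepted.
OP_RE = re.compile(
    r"remote_op_done:\s+Operation (\S+) of (\S+) by (\S+) for "
    r"(\S+?)(?:@| at )(\S+): (.+)$"
)


def unwrap(log_text):
    """Join mail-wrapped continuation lines back into one entry each."""
    entries = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if TIMESTAMP_RE.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line
    return entries


def failed_fence_ops(log_text):
    """Return (action, target, by_node, result) for each remote_op_done entry."""
    ops = []
    for entry in unwrap(log_text):
        match = OP_RE.search(entry)
        if match:
            action, target, by_node, _client, _op_id, result = match.groups()
            ops.append((action, target, by_node, result))
    return ops


sample = """
Jul  5 09:54:28 devpacemaker01 stonith-ng[20838]:   notice: remote_op_done:
Operation reboot of devpacemaker03_eth1 by devpacemaker02_eth1 for
stonith_admin.cman.8855 at devpacemaker02_eth1.6e0e0da3: No such device
Jul  5 09:54:28 devpacemaker01 crmd[20840]:   notice:
tengine_stonith_notify: Peer devpacemaker03_eth1 was not terminated
(reboot) by devpacemaker02_eth1 for devpacemaker02_eth1: No such device
(ref=6e0e0da3-f9f9-43a0-933e-0ff9ec2cb390) by client stonith_admin.cman.8855
Jul  5 09:58:07 devpacemaker01 stonith-ng[20838]:    error: remote_op_done:
Operation reboot of devpacemaker03_eth1 by devpacemaker01_eth1 for
stonith_admin.25870 at devpacemaker01_eth1.f5ed657f: No such device
"""

# Count failures per (target, node that attempted the operation).
counts = Counter((target, by_node) for _, target, by_node, _ in failed_fence_ops(sample))
for (target, by_node), n in counts.items():
    print(f"{target} via {by_node}: {n} failed attempt(s)")
```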

I had to revert everything to 1.1.8.

Any clues?