[Pacemaker] Trouble with "Failed application of an update diff"
Vitaliy Turovets
corebug at corebug.net
Fri May 30 08:32:00 UTC 2014
Hello there, people!
I am new to this list, so please excuse me if I'm posting to the wrong
place.
I've got a Pacemaker cluster with the following configuration:
http://pastebin.com/1SbWWh4n
Output of "crm status":
============
Last updated: Fri May 30 11:22:59 2014
Last change: Thu May 29 03:22:38 2014 via crmd on wb-db2
Stack: openais
Current DC: wb-db2 - partition with quorum
Version: 1.1.7-6.el6-148fccfd5985c5590cc601123c6c16e966b85d14
2 Nodes configured, 2 expected votes
7 Resources configured.
============
Online: [ wb-db2 wb-db1 ]
ClusterIP (ocf::heartbeat:IPaddr2): Started wb-db2
MySQL_Reader_VIP (ocf::heartbeat:IPaddr2): Started wb-db2
resMON (ocf::pacemaker:ClusterMon): Started wb-db2
Master/Slave Set: MySQL_MasterSlave [MySQL]
     Masters: [ wb-db2 ]
     Stopped: [ MySQL:1 ]
Clone Set: pingclone [ping-gateway]
     Started: [ wb-db1 wb-db2 ]
There was an unclean shutdown of the cluster, and since then the slave
instance of the MySQL_MasterSlave resource does not come up.
When I try a "cleanup MySQL_MasterSlave" (the exact invocation is
sketched after the log excerpt below), I see the following in the logs:
May 29 03:22:22 [4423] wb-db1 crmd: warning: decode_transition_key:
Bad UUID (crm-resource-4819) in sscanf result (3) for
0:0:crm-resource-4819
May 29 03:22:22 [4423] wb-db1 crmd: warning: decode_transition_key:
Bad UUID (crm-resource-4819) in sscanf result (3) for
0:0:crm-resource-4819
May 29 03:22:22 [4423] wb-db1 crmd: info: ais_dispatch_message:
Membership 408: quorum retained
May 29 03:22:22 [4418] wb-db1 cib: info: set_crm_log_level: New
log level: 3 0
May 29 03:22:38 [4421] wb-db1 attrd: notice: attrd_ais_dispatch:
Update relayed from wb-db2
May 29 03:22:38 [4421] wb-db1 attrd: notice: attrd_ais_dispatch:
Update relayed from wb-db2
May 29 03:22:38 [4418] wb-db1 cib: info: apply_xml_diff:
Digest mis-match: expected 2f5bc3d7f673df3cf37f774211976d69, calculated
b8a7adf0e34966242551556aab605286
May 29 03:22:38 [4418] wb-db1 cib: notice: cib_process_diff:
Diff 0.243.4 -> 0.243.5 not applied to 0.243.4: Failed application of an
update diff
May 29 03:22:38 [4418] wb-db1 cib: info:
cib_server_process_diff: Requesting re-sync from peer
May 29 03:22:38 [4418] wb-db1 cib: notice:
cib_server_process_diff: Not applying diff 0.243.4 -> 0.243.5 (sync in
progress)
May 29 03:22:38 [4418] wb-db1 cib: info: cib_replace_notify:
Replaced: -1.-1.-1 -> 0.243.5 from wb-db2
May 29 03:22:38 [4421] wb-db1 attrd: notice: attrd_trigger_update:
Sending flush op to all hosts for: pingd (100)
May 29 03:22:38 [4421] wb-db1 attrd: notice: attrd_trigger_update:
Sending flush op to all hosts for: probe_complete (true)
May 29 03:22:38 [4418] wb-db1 cib: info: set_crm_log_level: New
log level: 3 0
May 29 03:22:38 [4418] wb-db1 cib: info: apply_xml_diff:
Digest mis-match: expected 754ed3b1d999e34d93e0835b310fd98a, calculated
c322686deb255936ab54e064c696b6b8
May 29 03:22:38 [4418] wb-db1 cib: notice: cib_process_diff:
Diff 0.244.5 -> 0.244.6 not applied to 0.244.5: Failed application of an
update diff
May 29 03:22:38 [4418] wb-db1 cib: info:
cib_server_process_diff: Requesting re-sync from peer
May 29 03:22:38 [4423] wb-db1 crmd: info: delete_resource:
Removing resource MySQL:0 for 4996_crm_resource (internal) on wb-db2
May 29 03:22:38 [4423] wb-db1 crmd: info: notify_deleted:
Notifying 4996_crm_resource on wb-db2 that MySQL:0 was deleted
May 29 03:22:38 [4418] wb-db1 cib: notice:
cib_server_process_diff: Not applying diff 0.244.5 -> 0.244.6 (sync in
progress)
May 29 03:22:38 [4423] wb-db1 crmd: warning: decode_transition_key:
Bad UUID (crm-resource-4996) in sscanf result (3) for
0:0:crm-resource-4996
May 29 03:22:38 [4418] wb-db1 cib: notice:
cib_server_process_diff: Not applying diff 0.244.6 -> 0.244.7 (sync in
progress)
May 29 03:22:38 [4418] wb-db1 cib: notice:
cib_server_process_diff: Not applying diff 0.244.7 -> 0.244.8 (sync in
progress)
May 29 03:22:38 [4418] wb-db1 cib: info: cib_replace_notify:
Replaced: -1.-1.-1 -> 0.244.8 from wb-db2
May 29 03:22:38 [4421] wb-db1 attrd: notice: attrd_trigger_update:
Sending flush op to all hosts for: pingd (100)
May 29 03:22:38 [4421] wb-db1 attrd: notice: attrd_trigger_update:
Sending flush op to all hosts for: probe_complete (true)
May 29 03:22:38 [4423] wb-db1 crmd: notice: do_lrm_invoke: Not
creating resource for a delete event: (null)
May 29 03:22:38 [4423] wb-db1 crmd: info: notify_deleted:
Notifying 4996_crm_resource on wb-db2 that MySQL:1 was deleted
May 29 03:22:38 [4423] wb-db1 crmd: warning: decode_transition_key:
Bad UUID (crm-resource-4996) in sscanf result (3) for
0:0:crm-resource-4996
May 29 03:22:38 [4423] wb-db1 crmd: warning: decode_transition_key:
Bad UUID (crm-resource-4996) in sscanf result (3) for
0:0:crm-resource-4996
May 29 03:22:38 [4418] wb-db1 cib: info: set_crm_log_level: New
log level: 3 0
May 29 03:22:38 [4423] wb-db1 crmd: info: ais_dispatch_message:
Membership 408: quorum retained
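For reference, the cleanup above was invoked roughly as follows (a sketch
from memory, crm shell syntax; the low-level equivalent is included):

  # via the crm shell:
  crm resource cleanup MySQL_MasterSlave
  # or via the low-level tool:
  crm_resource --resource MySQL_MasterSlave --cleanup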
Here's the cibadmin -Q output from the node that is alive:
http://pastebin.com/aeqfTaCe
And here's the one from the failed node: http://pastebin.com/ME2U5vjK
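Both dumps were taken with plain "cibadmin -Q"; a sketch of how the two
can be compared, with illustrative filenames:

  cibadmin -Q > /tmp/cib-wb-db2.xml    # run on the healthy node
  cibadmin -Q > /tmp/cib-wb-db1.xml    # run on the failed node
  diff /tmp/cib-wb-db2.xml /tmp/cib-wb-db1.xml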
The question is: how do I clean things up so that the master/slave
resource MySQL_MasterSlave starts working properly again?
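What I am considering, based on the usual advice for a CIB that will not
resync, is to wipe the local CIB copy on the failed node and let it pull
a fresh one from the DC. A sketch only; the service names and the on-disk
location are assumptions for this version (1.1.7 on el6):

  # on the failed node wb-db1:
  service pacemaker stop                # or corosync, depending on the stack
  rm -f /var/lib/heartbeat/crm/cib*     # assumed CIB location on 1.1.7
  service pacemaker start
  # then retry the cleanup:
  crm resource cleanup MySQL_MasterSlave

Would that be safe here, or is there a better way?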
Thank you!
--
~~~
WBR,
Vitaliy Turovets
Lead Operations Engineer
Global Message Services Ukraine
+38(093)265-70-55
VITU-RIPE