[Pacemaker] What is the reason which the node in which failure has not occurred carries out "lost"?

Yusuke Iida yusk.iida at gmail.com
Fri Mar 7 01:35:52 EST 2014


Hi, Andrew
2014-03-07 11:43 GMT+09:00 Andrew Beekhof <andrew at beekhof.net>:
> I don't understand... crm_mon doesn't look for changes to resources or constraints and it should already be using the new faster diff format.
>
> [/me reads attachment]
>
> Ah, but perhaps I do understand afterall :-)
>
> This is repeated over and over:
>
>   notice: crm_diff_update:      [cib_diff_notify] Patch aborted: Application of an update diff failed (-206)
>   notice: xml_patch_version_check:      Current num_updates is too high (885 > 67)
>
> That would certainly drive up CPU usage and cause crm_mon to get left behind.
> Happily the fix for that should be: https://github.com/beekhof/pacemaker/commit/6c33820

With that fix, I believe the cib refresh is no longer repeated when
the versions differ.
Thank you for taking care of it.

Now, I see another problem.

If "crm configure load update" is run while crm_mon is running,
crm_mon stops displaying information.
The display comes back once crm_mon is restarted.

I ran the following command to capture the crm_mon log:
# crm_mon --disable-ncurses -VVVVVV >crm_mon.log 2>&1

I then examined the cib held inside crm_mon after the load was performed.

After the load, that cib contains two configuration sections.

This appears to come from the following operation: the old
configuration section remains because its deletion failed.
   trace: cib_native_dispatch_internal:         cib-reply <change operation="delete" path="/configuration"/>

Below is part of a debug log that I captured with an older pacemaker.
The section is not found because the search for path=/configuration
starts from the top <(null)> node of the document tree.
Shouldn't the path be path=/cib/configuration?

notice  Mar 04 11:33:10 __xml_find_path(1294):0: IDEBUG:   <(null)>
notice  Mar 04 11:33:10 __xml_find_path(1294):0: IDEBUG:     <cib epoch="2" num_updates="6" admin_epoch="0" validate-with="pacemaker-1.2" crm_feature_set="3.0.9" cib-last-written="Tue Mar  4 11:32:36 2014" update-origin="rhel64rpmbuild" update-client="crmd" have-quorum="1" dc-uuid="3232261524">
notice  Mar 04 11:33:10 __xml_find_path(1294):0: IDEBUG:       <configuration>
notice  Mar 04 11:33:10 __xml_find_path(1294):0: IDEBUG:         <crm_config>
notice  Mar 04 11:33:10 __xml_find_path(1294):0: IDEBUG:           <cluster_property_set id="cib-bootstrap-options">
notice  Mar 04 11:33:10 __xml_find_path(1294):0: IDEBUG:             <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="1.1.10-2dbaf19"/>
notice  Mar 04 11:33:10 __xml_find_path(1294):0: IDEBUG:             <nvpair id="cib-bootstrap-options-cluster-infrastructure" name="cluster-infrastructure" value="corosync"/>
notice  Mar 04 11:33:10 __xml_find_path(1294):0: IDEBUG:           </cluster_property_set>
notice  Mar 04 11:33:10 __xml_find_path(1294):0: IDEBUG:         </crm_config>
notice  Mar 04 11:33:10 __xml_find_path(1294):0: IDEBUG:         <nodes>
notice  Mar 04 11:33:10 __xml_find_path(1294):0: IDEBUG:           <node id="3232261524" uname="rhel64rpmbuild"/>
notice  Mar 04 11:33:10 __xml_find_path(1294):0: IDEBUG:         </nodes>
notice  Mar 04 11:33:10 __xml_find_path(1294):0: IDEBUG:         <resources/>
notice  Mar 04 11:33:10 __xml_find_path(1294):0: IDEBUG:         <constraints/>
notice  Mar 04 11:33:10 __xml_find_path(1294):0: IDEBUG:       </configuration>
notice  Mar 04 11:33:10 __xml_find_path(1294):0: IDEBUG:       <status>
notice  Mar 04 11:33:10 __xml_find_path(1294):0: IDEBUG:         <node_state id="3232261524" uname="rhel64rpmbuild" in_ccm="true" crmd="online" crm-debug-origin="do_state_transition" join="member" expected="member">
notice  Mar 04 11:33:10 __xml_find_path(1294):0: IDEBUG:           <lrm id="3232261524">
notice  Mar 04 11:33:10 __xml_find_path(1294):0: IDEBUG:             <lrm_resources/>
notice  Mar 04 11:33:10 __xml_find_path(1294):0: IDEBUG:           </lrm>
notice  Mar 04 11:33:10 __xml_find_path(1294):0: IDEBUG:           <transient_attributes id="3232261524">
notice  Mar 04 11:33:10 __xml_find_path(1294):0: IDEBUG:             <instance_attributes id="status-3232261524">
notice  Mar 04 11:33:10 __xml_find_path(1294):0: IDEBUG:               <nvpair id="status-3232261524-shutdown" name="shutdown" value="0"/>
notice  Mar 04 11:33:10 __xml_find_path(1294):0: IDEBUG:               <nvpair id="status-3232261524-probe_complete" name="probe_complete" value="true"/>
notice  Mar 04 11:33:10 __xml_find_path(1294):0: IDEBUG:             </instance_attributes>
notice  Mar 04 11:33:10 __xml_find_path(1294):0: IDEBUG:           </transient_attributes>
notice  Mar 04 11:33:10 __xml_find_path(1294):0: IDEBUG:         </node_state>
notice  Mar 04 11:33:10 __xml_find_path(1294):0: IDEBUG:       </status>
notice  Mar 04 11:33:10 __xml_find_path(1294):0: IDEBUG:     </cib>
notice  Mar 04 11:33:10 __xml_find_path(1294):0: IDEBUG:   </(null)>
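
To illustrate what I think is happening, here is a minimal Python sketch
of the lookup. It is not pacemaker's code: find_path and the "wrapper"
element are hypothetical stand-ins for __xml_find_path and the <(null)>
node, and the cib is abbreviated. It only assumes that the search walks
the path one element name at a time, starting from the children of the
top node.

```python
import xml.etree.ElementTree as ET

# Abbreviated stand-in for the cib shown in the log above.
CIB_XML = """
<cib epoch="2" num_updates="6" admin_epoch="0">
  <configuration>
    <crm_config/>
    <nodes/>
    <resources/>
    <constraints/>
  </configuration>
  <status/>
</cib>
""".strip()


def find_path(top, path):
    """Hypothetical helper: resolve an absolute path such as
    "/cib/configuration" by matching each name against the children
    of the current node, starting from `top`. Returns the element
    found, or None if any step has no matching child."""
    node = top
    for name in path.strip("/").split("/"):
        node = next((child for child in node if child.tag == name), None)
        if node is None:
            return None
    return node


# The wrapper plays the role of the unnamed <(null)> node: its only
# child is <cib>, so <configuration> is two levels down, not one.
top = ET.Element("wrapper")
top.append(ET.fromstring(CIB_XML))

missing = find_path(top, "/configuration")      # no such child of the top node
found = find_path(top, "/cib/configuration")    # matches <cib>, then <configuration>

print(missing)
print(found.tag)
```

Under that assumption, a delete for path=/configuration finds nothing
to remove, while path=/cib/configuration reaches the section.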


Is this an already known problem?

I have attached the crm_report taken when this occurred, along with the crm_mon log.

- crm_report
https://drive.google.com/file/d/0BwMFJItoO-fVWEw4Qnp0aHIzSm8/edit?usp=sharing
- crm_mon.log
https://drive.google.com/file/d/0BwMFJItoO-fVRDRMTGtUUEdBc1E/edit?usp=sharing

Regards,
Yusuke


-- 
----------------------------------------
METRO SYSTEMS CO., LTD

Yusuke Iida
Mail: yusk.iida at gmail.com
----------------------------------------



