[Pacemaker] removed resources still generating log entries
Kevin Maguire
kmaguire at eso.org
Thu Dec 22 12:43:51 CET 2011
Hi
We have built a cluster on top of the SLES 11 SP1 stack, which manages various Xen VMs.
In the development phase we used some test VM resources, which have since been removed from the configuration. However, I still see remnants of these old resources in the log files and would like to clean this up.
For example, I see:
Dec 22 12:27:18 node2 pengine: [6262]: info: get_failcount: hvm1 has failed 1 times on node2
Dec 22 12:27:18 node2 pengine: [6262]: notice: common_apply_stickiness: hvm1 can fail 999999 more times on node2 before being forced off
Dec 22 12:27:18 node2 attrd: [6261]: info: attrd_trigger_update: Sending flush op to all hosts for: fail-count-hvm1 (1)
Dec 22 12:27:18 node2 attrd: [6261]: info: attrd_trigger_update: Sending flush op to all hosts for: last-failure-hvm1 (1322579680)
hvm1 was one of the VMs used in that test phase.
If I dump the CIB, I find this section:
<status>
<node_state uname="node2" ha="active" in_ccm="true" crmd="online" join="member" expected="member" shutdown="0" id="node2" crm-debug-origin="do_state_transition">
<lrm id="node2">
<lrm_resources>
...
<lrm_resource id="hvm1" type="Xen" class="ocf" provider="heartbeat">
<lrm_rsc_op id="hvm1_monitor_0" operation="monitor" crm-debug-origin="build_active_RAs" crm_feature_set="3.0.2" transition-key="20:11:7:1fd9e9b1-610e-4768-abd5-35ea3ce45c4d" transition-magic="0:7;20:11:7:1fd9e9b1-610e-4768-abd5-35ea3ce45c4d" call-id="27" rc-code="7" op-status="0" interval="0" last-run="1322130825" last-rc-change="1322130825" exec-time="550" queue-time="0" op-digest="71594dc818f53dfe034bb5e84c6d80fb"/>
<lrm_rsc_op id="hvm1_stop_0" operation="stop" crm-debug-origin="build_active_RAs" crm_feature_set="3.0.2" transition-key="61:511:0:abda911e-05ed-4e11-8e25-ab03a1bfd7b7" transition-magic="0:0;61:511:0:abda911e-05ed-4e11-8e25-ab03a1bfd7b7" call-id="56" rc-code="0" op-status="0" interval="0" last-run="1322580820" last-rc-change="1322580820" exec-time="164320" queue-time="0" op-digest="71594dc818f53dfe034bb5e84c6d80fb"/>
<lrm_rsc_op id="hvm1_start_0" operation="start" crm-debug-origin="build_active_RAs" crm_feature_set="3.0.2" transition-key="59:16:0:1fd9e9b1-610e-4768-abd5-35ea3ce45c4d" transition-magic="0:0;59:16:0:1fd9e9b1-610e-4768-abd5-35ea3ce45c4d" call-id="30" rc-code="0" op-status="0" interval="0" last-run="1322131559" last-rc-change="1322131559" exec-time="470" queue-time="0" op-digest="71594dc818f53dfe034bb5e84c6d80fb"/>
</lrm_resource>
...
I tried:
cibadmin -Q > tmp.xml
vi tmp.xml
cibadmin --replace --xml-file tmp.xml
but this does not do the job, I guess because the problematic bits are in the status section.
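I also considered scoping the query/replace to just the status section via cibadmin's -o/--obj_type option (untested on my part, so the exact invocation below is a guess):
cibadmin -Q -o status > status.xml
# remove the stale hvm1 <lrm_resource> entries by hand
vi status.xml
cibadmin --replace -o status --xml-file status.xml
but I suspect the cluster may simply repopulate those entries from the lrmd's operation history anyway.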
Any clue how to clean this up properly, preferably without any cluster downtime?
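For what it's worth, I also wondered whether the crm shell's cleanup commands are meant for this case, e.g.
crm resource cleanup hvm1
crm resource failcount hvm1 delete node2
but I'm not sure they still apply once the resource has been removed from the configuration.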
Thanks,
Kevin
Version info:
node2 # rpm -qa | egrep "heartbeat|pacemaker|cluster|openais"
libopenais3-1.1.2-0.5.19
pacemaker-mgmt-2.0.0-0.2.19
openais-1.1.2-0.5.19
cluster-network-kmp-xen-1.4_2.6.32.12_0.6-2.1.73
libpacemaker3-1.1.2-0.2.1
drbd-heartbeat-8.3.7-0.4.15
cluster-glue-1.0.5-0.5.1
drbd-pacemaker-8.3.7-0.4.15
cluster-network-kmp-default-1.4_2.6.32.12_0.6-2.1.73
pacemaker-1.1.2-0.2.1
yast2-cluster-2.15.0-8.6.19
pacemaker-mgmt-client-2.0.0-0.2.19