[ClusterLabs] Cannot clean history
Alexandre
alxgomz at gmail.com
Tue May 26 10:58:15 CEST 2015
Hi Andrew,
Here is the output of the verbose crm_failcount.
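For context, this came from a delete call roughly of the form below (a sketch
only: <resource> is a placeholder, since the trace does not echo the resource
name back, and the exact option spellings may differ slightly on 1.1.10):

  crm_failcount -VVVVVV -D -r <resource> -N node1.domain.com
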
trace: set_crm_log_level: New log level: 8
trace: cib_native_signon_raw: Connecting cib_rw channel
trace: pick_ipc_buffer: Using max message size of 524288
debug: qb_rb_open_2: shm size:524301; real_size:528384;
rb->word_size:132096
debug: qb_rb_open_2: shm size:524301; real_size:528384;
rb->word_size:132096
debug: qb_rb_open_2: shm size:524301; real_size:528384;
rb->word_size:132096
trace: mainloop_add_fd: Added connection 1 for cib_rw[0x1fd79c0].4
trace: pick_ipc_buffer: Using max message size of 51200
trace: crm_ipc_send: Sending from client: cib_rw request id: 1
bytes: 131 timeout:-1 msg...
trace: crm_ipc_send: Recieved response 1, size=140, rc=140, text:
<cib_common_callback_worker cib_op="register"
cib_clientid="f8cfae2d-51e6-4cd7-97f8-2d6f49bf1f17"/>
trace: cib_native_signon_raw: reg-reply
<cib_common_callback_worker cib_op="register"
cib_clientid="f8cfae2d-51e6-4cd7-97f8-2d6f49bf1f17"/>
debug: cib_native_signon_raw: Connection to CIB successful
trace: cib_create_op: Sending call options: 00001100, 4352
trace: cib_native_perform_op_delegate: Sending cib_query message to
CIB service (timeout=120s)
trace: crm_ipc_send: Sending from client: cib_rw request id: 2
bytes: 211 timeout:120000 msg...
trace: internal_ipc_get_reply: client cib_rw waiting on reply to msg
id 2
trace: crm_ipc_send: Recieved response 2, size=944, rc=944, text:
<cib-reply t="cib" cib_op="cib_query" cib_callid="2"
cib_clientid="f8cfae2d-51e6-4cd7-97f8-2d6f49bf1f17" cib_callopt="4352"
cib_rc="0"><cib_calldata><nodes><node uname="node2.domain.com" id="o
trace: cib_native_perform_op_delegate: Reply <cib-reply t="cib"
cib_op="cib_query" cib_callid="2"
cib_clientid="f8cfae2d-51e6-4cd7-97f8-2d6f49bf1f17" cib_callopt="4352"
cib_rc="0">
trace: cib_native_perform_op_delegate: Reply <cib_calldata>
trace: cib_native_perform_op_delegate: Reply <nodes>
trace: cib_native_perform_op_delegate: Reply <node uname="
node2.domain.com" id="node2.domain.com">
trace: cib_native_perform_op_delegate: Reply
<instance_attributes id="nodes-node2.domain.com">
trace: cib_native_perform_op_delegate: Reply <nvpair
id="nodes-node2.domain.com-postgres_msg-data-status"
name="postgres_msg-data-status" value="STREAMING|SYNC"/>
trace: cib_native_perform_op_delegate: Reply <nvpair
id="nodes-node2.domain.com-standby" name="standby" value="off"/>
trace: cib_native_perform_op_delegate: Reply
</instance_attributes>
trace: cib_native_perform_op_delegate: Reply </node>
trace: cib_native_perform_op_delegate: Reply <node uname="
node1.domain.com" id="node1.domain.com">
trace: cib_native_perform_op_delegate: Reply
<instance_attributes id="nodes-node1.domain.com">
trace: cib_native_perform_op_delegate: Reply <nvpair
id="nodes-node1.domain.com-postgres_msg-data-status"
name="postgres_msg-data-status" value="LATEST"/>
trace: cib_native_perform_op_delegate: Reply <nvpair
id="nodes-node1.domain.com-standby" name="standby" value="off"/>
trace: cib_native_perform_op_delegate: Reply
</instance_attributes>
trace: cib_native_perform_op_delegate: Reply </node>
trace: cib_native_perform_op_delegate: Reply </nodes>
trace: cib_native_perform_op_delegate: Reply </cib_calldata>
trace: cib_native_perform_op_delegate: Reply </cib-reply>
trace: cib_native_perform_op_delegate: Syncronous reply 2 received
debug: get_cluster_node_uuid: Result section <nodes>
debug: get_cluster_node_uuid: Result section <node uname="
node2.domain.com" id="node2.domain.com">
debug: get_cluster_node_uuid: Result section
<instance_attributes id="nodes-node2.domain.com">
debug: get_cluster_node_uuid: Result section <nvpair
id="nodes-node2.domain.com-postgres_msg-data-status"
name="postgres_msg-data-status" value="STREAMING|SYNC"/>
debug: get_cluster_node_uuid: Result section <nvpair
id="nodes-node2.domain.com-standby" name="standby" value="off"/>
debug: get_cluster_node_uuid: Result section
</instance_attributes>
debug: get_cluster_node_uuid: Result section </node>
debug: get_cluster_node_uuid: Result section <node uname="
node1.domain.com" id="node1.domain.com">
debug: get_cluster_node_uuid: Result section
<instance_attributes id="nodes-node1.domain.com">
debug: get_cluster_node_uuid: Result section <nvpair
id="nodes-node1.domain.com-postgres_msg-data-status"
name="postgres_msg-data-status" value="LATEST"/>
debug: get_cluster_node_uuid: Result section <nvpair
id="nodes-node1.domain.com-standby" name="standby" value="off"/>
debug: get_cluster_node_uuid: Result section
</instance_attributes>
debug: get_cluster_node_uuid: Result section </node>
debug: get_cluster_node_uuid: Result section </nodes>
info: query_node_uuid: Mapped node1.domain.com to node1.domain.com
trace: pick_ipc_buffer: Using max message size of 51200
info: attrd_update_delegate: Connecting to cluster... 5 retries
remaining
debug: qb_rb_open_2: shm size:51213; real_size:53248;
rb->word_size:13312
debug: qb_rb_open_2: shm size:51213; real_size:53248;
rb->word_size:13312
debug: qb_rb_open_2: shm size:51213; real_size:53248;
rb->word_size:13312
trace: crm_ipc_send: Sending from client: attrd request id: 3 bytes:
168 timeout:5000 msg...
trace: internal_ipc_get_reply: client attrd waiting on reply to msg
id 3
trace: crm_ipc_send: Recieved response 3, size=88, rc=88, text: <ack
function="attrd_ipc_dispatch" line="129"/>
debug: attrd_update_delegate: Sent update: (null)=(null) for
node1.domain.com
info: main: Update (null)=<none> sent via attrd
debug: cib_native_signoff: Signing out of the CIB Service
trace: mainloop_del_fd: Removing client cib_rw[0x1fd79c0]
trace: mainloop_gio_destroy: Destroying client cib_rw[0x1fd79c0]
trace: crm_ipc_close: Disconnecting cib_rw IPC connection 0x1fdb020
(0x1fdb1a0.(nil))
debug: qb_ipcc_disconnect: qb_ipcc_disconnect()
trace: qb_rb_close: ENTERING qb_rb_close()
debug: qb_rb_close: Closing ringbuffer:
/dev/shm/qb-cib_rw-request-8347-9344-14-header
trace: qb_rb_close: ENTERING qb_rb_close()
debug: qb_rb_close: Closing ringbuffer:
/dev/shm/qb-cib_rw-response-8347-9344-14-header
trace: qb_rb_close: ENTERING qb_rb_close()
debug: qb_rb_close: Closing ringbuffer:
/dev/shm/qb-cib_rw-event-8347-9344-14-header
trace: cib_native_destroy: destroying 0x1fd7910
trace: crm_ipc_destroy: Destroying IPC connection to cib_rw:
0x1fdb020
trace: mainloop_gio_destroy: Destroyed client cib_rw[0x1fd79c0]
trace: crm_exit: cleaning up libxml
info: crm_xml_cleanup: Cleaning up memory from libxml2
trace: crm_exit: exit 0
I hope it helps.
2015-05-20 6:34 GMT+02:00 Andrew Beekhof <andrew at beekhof.net>:
>
> > On 4 May 2015, at 6:43 pm, Alexandre <alxgomz at gmail.com> wrote:
> >
> > Hi,
> >
> > I have a pacemaker / corosync / cman cluster running on redhat 6.6.
> > Although the cluster is working as expected, I have some traces of old
> failures (from several months ago) that I can't get rid of.
> > Basically, I have set cluster-recheck-interval="300" and
> failure-timeout="600" (in rsc_defaults), as shown below:
> >
> > property $id="cib-bootstrap-options" \
> > dc-version="1.1.10-14.el6-368c726" \
> > cluster-infrastructure="cman" \
> > expected-quorum-votes="2" \
> > no-quorum-policy="ignore" \
> > stonith-enabled="false" \
> > last-lrm-refresh="1429702408" \
> > maintenance-mode="false" \
> > cluster-recheck-interval="300"
> > rsc_defaults $id="rsc-options" \
> > failure-timeout="600"
> >
> > So I would expect old failures to have been purged from the CIB long ago,
> but I actually see the following when issuing crm_mon -frA1.
>
> I think automatic deletion didn't arrive until later.
>
> >
> > Migration summary:
> > * Node host1:
> > etc_ml_drbd: migration-threshold=1000000 fail-count=244
> last-failure='Sat Feb 14 17:04:05 2015'
> > spool_postfix_drbd_msg: migration-threshold=1000000 fail-count=244
> last-failure='Sat Feb 14 17:04:05 2015'
> > lib_ml_drbd: migration-threshold=1000000 fail-count=244
> last-failure='Sat Feb 14 17:04:05 2015'
> > lib_imap_drbd: migration-threshold=1000000 fail-count=244
> last-failure='Sat Feb 14 17:04:05 2015'
> > spool_imap_drbd: migration-threshold=1000000 fail-count=11654
> last-failure='Sat Feb 14 17:04:05 2015'
> > spool_ml_drbd: migration-threshold=1000000 fail-count=244
> last-failure='Sat Feb 14 17:04:05 2015'
> > documents_drbd: migration-threshold=1000000 fail-count=248
> last-failure='Sat Feb 14 17:58:55 2015'
> > * Node host2:
> > documents_drbd: migration-threshold=1000000 fail-count=548
> last-failure='Sat Feb 14 16:26:33 2015'
> >
> > I have tried crm_failcount -D on the resources and also tried a cleanup... but
> it's still there!
>
> Oh? Can you re-run with -VVVVVV and show us the result?
>
> > How can I get rid of those records (so my monitoring tools stop
> complaining)?
> >
> > Regards.