[ClusterLabs] Cannot clean history

Alexandre alxgomz at gmail.com
Tue May 26 10:58:15 CEST 2015


Hi Andrew,

Here is the output of crm_failcount re-run with extra verbosity, as requested.
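
The exact invocation isn't reproduced in the trace, so the line below is only
a sketch of the delete call that produced it; the resource name is omitted
and the option spellings may differ between Pacemaker versions:

   crm_failcount -VVVVVV -D -r <resource> -N node1.domain.com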

   trace: set_crm_log_level:     New log level: 8
   trace: cib_native_signon_raw:     Connecting cib_rw channel
   trace: pick_ipc_buffer:     Using max message size of 524288
   debug: qb_rb_open_2:     shm size:524301; real_size:528384;
rb->word_size:132096
   debug: qb_rb_open_2:     shm size:524301; real_size:528384;
rb->word_size:132096
   debug: qb_rb_open_2:     shm size:524301; real_size:528384;
rb->word_size:132096
   trace: mainloop_add_fd:     Added connection 1 for cib_rw[0x1fd79c0].4
   trace: pick_ipc_buffer:     Using max message size of 51200
   trace: crm_ipc_send:     Sending from client: cib_rw request id: 1
bytes: 131 timeout:-1 msg...
   trace: crm_ipc_send:     Recieved response 1, size=140, rc=140, text:
<cib_common_callback_worker cib_op="register"
cib_clientid="f8cfae2d-51e6-4cd7-97f8-2d6f49bf1f17"/>
   trace: cib_native_signon_raw:     reg-reply
<cib_common_callback_worker cib_op="register"
cib_clientid="f8cfae2d-51e6-4cd7-97f8-2d6f49bf1f17"/>
   debug: cib_native_signon_raw:     Connection to CIB successful
   trace: cib_create_op:     Sending call options: 00001100, 4352
   trace: cib_native_perform_op_delegate:     Sending cib_query message to
CIB service (timeout=120s)
   trace: crm_ipc_send:     Sending from client: cib_rw request id: 2
bytes: 211 timeout:120000 msg...
   trace: internal_ipc_get_reply:     client cib_rw waiting on reply to msg
id 2
   trace: crm_ipc_send:     Recieved response 2, size=944, rc=944, text:
<cib-reply t="cib" cib_op="cib_query" cib_callid="2"
cib_clientid="f8cfae2d-51e6-4cd7-97f8-2d6f49bf1f17" cib_callopt="4352"
cib_rc="0"><cib_calldata><nodes><node uname="node2.domain.com" id="o
   trace: cib_native_perform_op_delegate:     Reply   <cib-reply t="cib"
cib_op="cib_query" cib_callid="2"
cib_clientid="f8cfae2d-51e6-4cd7-97f8-2d6f49bf1f17" cib_callopt="4352"
cib_rc="0">
   trace: cib_native_perform_op_delegate:     Reply     <cib_calldata>
   trace: cib_native_perform_op_delegate:     Reply       <nodes>
   trace: cib_native_perform_op_delegate:     Reply         <node uname="
node2.domain.com" id="node2.domain.com">
   trace: cib_native_perform_op_delegate:     Reply
<instance_attributes id="nodes-node2.domain.com">
   trace: cib_native_perform_op_delegate:     Reply             <nvpair
id="nodes-node2.domain.com-postgres_msg-data-status"
name="postgres_msg-data-status" value="STREAMING|SYNC"/>
   trace: cib_native_perform_op_delegate:     Reply             <nvpair
id="nodes-node2.domain.com-standby" name="standby" value="off"/>
   trace: cib_native_perform_op_delegate:     Reply
</instance_attributes>
   trace: cib_native_perform_op_delegate:     Reply         </node>
   trace: cib_native_perform_op_delegate:     Reply         <node uname="
node1.domain.com" id="node1.domain.com">
   trace: cib_native_perform_op_delegate:     Reply
<instance_attributes id="nodes-node1.domain.com">
   trace: cib_native_perform_op_delegate:     Reply             <nvpair
id="nodes-node1.domain.com-postgres_msg-data-status"
name="postgres_msg-data-status" value="LATEST"/>
   trace: cib_native_perform_op_delegate:     Reply             <nvpair
id="nodes-node1.domain.com-standby" name="standby" value="off"/>
   trace: cib_native_perform_op_delegate:     Reply
</instance_attributes>
   trace: cib_native_perform_op_delegate:     Reply         </node>
   trace: cib_native_perform_op_delegate:     Reply       </nodes>
   trace: cib_native_perform_op_delegate:     Reply     </cib_calldata>
   trace: cib_native_perform_op_delegate:     Reply   </cib-reply>
   trace: cib_native_perform_op_delegate:     Syncronous reply 2 received
   debug: get_cluster_node_uuid:     Result section   <nodes>
   debug: get_cluster_node_uuid:     Result section     <node uname="
node2.domain.com" id="node2.domain.com">
   debug: get_cluster_node_uuid:     Result section
<instance_attributes id="nodes-node2.domain.com">
   debug: get_cluster_node_uuid:     Result section         <nvpair
id="nodes-node2.domain.com-postgres_msg-data-status"
name="postgres_msg-data-status" value="STREAMING|SYNC"/>
   debug: get_cluster_node_uuid:     Result section         <nvpair
id="nodes-node2.domain.com-standby" name="standby" value="off"/>
   debug: get_cluster_node_uuid:     Result section
</instance_attributes>
   debug: get_cluster_node_uuid:     Result section     </node>
   debug: get_cluster_node_uuid:     Result section     <node uname="
node1.domain.com" id="node1.domain.com">
   debug: get_cluster_node_uuid:     Result section
<instance_attributes id="nodes-node1.domain.com">
   debug: get_cluster_node_uuid:     Result section         <nvpair
id="nodes-node1.domain.com-postgres_msg-data-status"
name="postgres_msg-data-status" value="LATEST"/>
   debug: get_cluster_node_uuid:     Result section         <nvpair
id="nodes-node1.domain.com-standby" name="standby" value="off"/>
   debug: get_cluster_node_uuid:     Result section
</instance_attributes>
   debug: get_cluster_node_uuid:     Result section     </node>
   debug: get_cluster_node_uuid:     Result section   </nodes>
    info: query_node_uuid:     Mapped node1.domain.com to node1.domain.com
   trace: pick_ipc_buffer:     Using max message size of 51200
    info: attrd_update_delegate:     Connecting to cluster... 5 retries
remaining
   debug: qb_rb_open_2:     shm size:51213; real_size:53248;
rb->word_size:13312
   debug: qb_rb_open_2:     shm size:51213; real_size:53248;
rb->word_size:13312
   debug: qb_rb_open_2:     shm size:51213; real_size:53248;
rb->word_size:13312
   trace: crm_ipc_send:     Sending from client: attrd request id: 3 bytes:
168 timeout:5000 msg...
   trace: internal_ipc_get_reply:     client attrd waiting on reply to msg
id 3
   trace: crm_ipc_send:     Recieved response 3, size=88, rc=88, text: <ack
function="attrd_ipc_dispatch" line="129"/>
   debug: attrd_update_delegate:     Sent update: (null)=(null) for
node1.domain.com
    info: main:     Update (null)=<none> sent via attrd
   debug: cib_native_signoff:     Signing out of the CIB Service
   trace: mainloop_del_fd:     Removing client cib_rw[0x1fd79c0]
   trace: mainloop_gio_destroy:     Destroying client cib_rw[0x1fd79c0]
   trace: crm_ipc_close:     Disconnecting cib_rw IPC connection 0x1fdb020
(0x1fdb1a0.(nil))
   debug: qb_ipcc_disconnect:     qb_ipcc_disconnect()
   trace: qb_rb_close:     ENTERING qb_rb_close()
   debug: qb_rb_close:     Closing ringbuffer:
/dev/shm/qb-cib_rw-request-8347-9344-14-header
   trace: qb_rb_close:     ENTERING qb_rb_close()
   debug: qb_rb_close:     Closing ringbuffer:
/dev/shm/qb-cib_rw-response-8347-9344-14-header
   trace: qb_rb_close:     ENTERING qb_rb_close()
   debug: qb_rb_close:     Closing ringbuffer:
/dev/shm/qb-cib_rw-event-8347-9344-14-header
   trace: cib_native_destroy:     destroying 0x1fd7910
   trace: crm_ipc_destroy:     Destroying IPC connection to cib_rw:
0x1fdb020
   trace: mainloop_gio_destroy:     Destroyed client cib_rw[0x1fd79c0]
   trace: crm_exit:     cleaning up libxml
    info: crm_xml_cleanup:     Cleaning up memory from libxml2
   trace: crm_exit:     exit 0

I hope it helps.
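
For completeness, the cleanup attempts mentioned in the quoted thread below
were along these lines (resource names omitted, and the exact options are
again only approximate):

   crm_failcount -D -r <resource> -N node1.domain.com
   crm_resource --cleanup --resource <resource> --node node1.domain.com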

2015-05-20 6:34 GMT+02:00 Andrew Beekhof <andrew at beekhof.net>:

>
> > On 4 May 2015, at 6:43 pm, Alexandre <alxgomz at gmail.com> wrote:
> >
> > Hi,
> >
> > I have a pacemaker / corosync / cman cluster running on Red Hat 6.6.
> > Although the cluster is working as expected, I still have traces of old
> > failures (from several months ago) that I can't get rid of.
> > Basically I have set cluster-recheck-interval="300" and
> > failure-timeout="600" (in rsc_defaults) as shown below:
> >
> > property $id="cib-bootstrap-options" \
> >     dc-version="1.1.10-14.el6-368c726" \
> >     cluster-infrastructure="cman" \
> >     expected-quorum-votes="2" \
> >     no-quorum-policy="ignore" \
> >     stonith-enabled="false" \
> >     last-lrm-refresh="1429702408" \
> >     maintenance-mode="false" \
> >     cluster-recheck-interval="300"
> > rsc_defaults $id="rsc-options" \
> >     failure-timeout="600"
> >
> > So I would expect old failures to have been purged from the CIB long ago,
> > but I still see the following when issuing crm_mon -frA1.
>
> I think automatic deletion didn't arrive until later.
>
> >
> > Migration summary:
> > * Node host1:
> >    etc_ml_drbd: migration-threshold=1000000 fail-count=244
> last-failure='Sat Feb 14 17:04:05 2015'
> >    spool_postfix_drbd_msg: migration-threshold=1000000 fail-count=244
> last-failure='Sat Feb 14 17:04:05 2015'
> >    lib_ml_drbd: migration-threshold=1000000 fail-count=244
> last-failure='Sat Feb 14 17:04:05 2015'
> >    lib_imap_drbd: migration-threshold=1000000 fail-count=244
> last-failure='Sat Feb 14 17:04:05 2015'
> >    spool_imap_drbd: migration-threshold=1000000 fail-count=11654
> last-failure='Sat Feb 14 17:04:05 2015'
> >    spool_ml_drbd: migration-threshold=1000000 fail-count=244
> last-failure='Sat Feb 14 17:04:05 2015'
> >    documents_drbd: migration-threshold=1000000 fail-count=248
> last-failure='Sat Feb 14 17:58:55 2015'
> > * Node host2:
> >    documents_drbd: migration-threshold=1000000 fail-count=548
> last-failure='Sat Feb 14 16:26:33 2015'
> >
> > I have tried crm_failcount -D on the resources and also tried a cleanup...
> > but the failures are still there!
>
> Oh?  Can you re-run with -VVVVVV and show us the result?
>
> > How can I get rid of those records (so my monitoring tools stop
> > complaining)?
> >
> > Regards.
> > _______________________________________________
> > Users mailing list: Users at clusterlabs.org
> > http://clusterlabs.org/mailman/listinfo/users
> >
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org
>