[Pacemaker] pacemaker shutdown under high load
Alessandro Bono
alessandro.bono at gmail.com
Wed Oct 16 14:37:01 UTC 2013
On 16/10/2013 00:11, Andrew Beekhof wrote:
> On 09/10/2013, at 10:53 PM, Alessandro Bono <alessandro.bono at gmail.com> wrote:
>
>> Hi
>>
>>
>> this weekend Pacemaker shut down on the primary node during a machine backup.
>> Attached is the compressed log of the primary node; the log of the secondary node is too big, but I can provide it as an external link if needed.
>> Inspecting the logs, I found these errors:
> looks like corosync went away from underneath pacemaker, hence "Corosync connection lost! Exiting."
Is there a way to debug this problem? The nodes are regular CentOS 6.4 64-bit
machines with this corosync version:
corosync-1.4.1-15.el6_4.1.x86_64
corosynclib-1.4.1-15.el6_4.1.x86_64
Should I package the latest 1.4.x version and try it?
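
As a first triage step, the Pacemaker log lines quoted in this thread can be reduced to the first error or crit entry per daemon. The following is only a minimal sketch: the regular expression is inferred from the excerpt in this mail (timestamp, [pid], host, daemon, level, function, message), and the names `parse_line` and `first_failures` are made up for illustration. It reports entries in file order and only shows Pacemaker's side; it cannot say why corosync itself went away.

```python
import re
from typing import Dict, Iterable, NamedTuple, Optional

# Pattern inferred from the log excerpt in this thread (an assumption, not an
# official format definition). An optional leading "> " quote prefix is allowed.
LINE_RE = re.compile(
    r"^(?:>+\s*)?"
    r"(?P<ts>\w{3} \d{2} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<pid>\d+)\] "
    r"(?P<host>\S+)\s+"
    r"(?P<daemon>[\w-]+):\s+"
    r"(?P<level>\w+):\s+"
    r"(?P<func>\w+):\s+"
    r"(?P<msg>.*)$"
)


class Entry(NamedTuple):
    ts: str
    pid: int
    host: str
    daemon: str
    level: str
    func: str
    msg: str


def parse_line(line: str) -> Optional[Entry]:
    """Return the parsed fields of one log line, or None if it does not match."""
    m = LINE_RE.match(line.rstrip("\n"))
    if m is None:
        return None
    d = m.groupdict()
    return Entry(d["ts"], int(d["pid"]), d["host"], d["daemon"],
                 d["level"], d["func"], d["msg"])


def first_failures(lines: Iterable[str]) -> Dict[str, Entry]:
    """Map each daemon to its first 'error' or 'crit' entry, in file order."""
    first: Dict[str, Entry] = {}
    for line in lines:
        entry = parse_line(line)
        if entry and entry.level in ("error", "crit") and entry.daemon not in first:
            first[entry.daemon] = entry
    return first
```

Feeding it an open log file, for example `first_failures(open(path))` with `path` pointing at the uncompressed log, gives one entry per daemon to compare against corosync's own messages from the same second.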
As a workaround I put the cluster in maintenance mode prior to the backup, but
that's not a solution.
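
For reference, the workaround can be wrapped so that maintenance mode is always switched off again, even if the backup fails. This is a sketch under assumptions: it sets the `maintenance-mode` cluster property through `crm_attribute` (other tools can set the same property), and the helper names `maintenance_cmd` and `maintenance_mode` are hypothetical.

```python
import subprocess
from contextlib import contextmanager
from typing import Callable, Iterator, List


def maintenance_cmd(enabled: bool) -> List[str]:
    """Build the crm_attribute call that sets the maintenance-mode property."""
    return [
        "crm_attribute",
        "--type", "crm_config",
        "--name", "maintenance-mode",
        "--update", "true" if enabled else "false",
    ]


@contextmanager
def maintenance_mode(
    run: Callable[[List[str]], object] = subprocess.check_call,
) -> Iterator[None]:
    """Enable maintenance mode for the duration of the with-block.

    `run` executes a command given as an argument list; it is a parameter so
    the wrapper can be exercised without a cluster.
    """
    run(maintenance_cmd(True))
    try:
        yield
    finally:
        # Always leave maintenance mode, even if the wrapped job raised.
        run(maintenance_cmd(False))
```

A backup job would then run inside `with maintenance_mode(): ...`. This only hides the symptom on the Pacemaker side; it does not explain why corosync stops responding under load.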
>
>> Oct 05 22:26:46 [31338] ga1-ext cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=ga2-ext/crmd/17, version=0.155.87)
>> Oct 05 22:26:46 [31341] ga1-ext attrd: error: pcmk_cpg_dispatch: Connection to the CPG API failed: Library error (2)
>> Oct 05 22:26:46 [31343] ga1-ext crmd: error: pcmk_cpg_dispatch: Connection to the CPG API failed: Library error (2)
>> Oct 05 22:26:46 [31343] ga1-ext crmd: error: crmd_cs_destroy: connection terminated
>> Oct 05 22:26:46 [31343] ga1-ext crmd: debug: qb_ipcs_unref: qb_ipcs_unref() - destroying
>> Oct 05 22:26:47 [31343] ga1-ext crmd: info: qb_ipcs_us_withdraw: withdrawing server sockets
>> Oct 05 22:26:47 [31343] ga1-ext crmd: debug: qb_ipcc_disconnect: qb_ipcc_disconnect()
>> Oct 05 22:26:47 [31343] ga1-ext crmd: debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-attrd-request-31341-31343-9-header
>> Oct 05 22:26:46 [31332] ga1-ext pacemakerd: error: pcmk_cpg_dispatch: Connection to the CPG API failed: Library error (2)
>> Oct 05 22:26:46 [31339] ga1-ext stonith-ng: error: pcmk_cpg_dispatch: Connection to the CPG API failed: Library error (2)
>> Oct 05 22:26:46 [31338] ga1-ext cib: error: pcmk_cpg_dispatch: Connection to the CPG API failed: Library error (2)
>> Oct 05 22:26:47 [31343] ga1-ext crmd: debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-attrd-response-31341-31343-9-header
>> Oct 05 22:26:47 [31332] ga1-ext pacemakerd: error: mcp_cpg_destroy: Connection destroyed
>> Oct 05 22:26:47 [31339] ga1-ext stonith-ng: error: stonith_peer_cs_destroy: Corosync connection terminated
>> Oct 05 22:26:47 [31339] ga1-ext stonith-ng: info: stonith_shutdown: Terminating with 1 clients
>> Oct 05 22:26:47 [31339] ga1-ext stonith-ng: debug: cib_native_signoff: Signing out of the CIB Service
>> Oct 05 22:26:47 [31339] ga1-ext stonith-ng: debug: qb_ipcc_disconnect: qb_ipcc_disconnect()
>> Oct 05 22:26:47 [31343] ga1-ext crmd: debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-attrd-event-31341-31343-9-header
>> Oct 05 22:26:47 [31341] ga1-ext attrd: crit: attrd_cs_destroy: Lost connection to Corosync service!
>> Oct 05 22:26:47 [31341] ga1-ext attrd: notice: main: Exiting...
>> Oct 05 22:26:47 [31341] ga1-ext attrd: notice: main: Disconnecting client 0x1b03990, pid=31343...
>> Oct 05 22:26:47 [31341] ga1-ext attrd: debug: qb_ipcs_disconnect: qb_ipcs_disconnect(31341-31343-9) state:2
>> Oct 05 22:26:47 [31341] ga1-ext attrd: info: crm_client_destroy: Destroying 0 events
>> Oct 05 22:26:47 [31338] ga1-ext cib: error: cib_cs_destroy: Corosync connection lost! Exiting.
>>
>> PS: this is a resend to open a new thread; sorry for the duplicate mail.
>>
>> --
>> Cordiali Saluti
>> Alessandro Bono
>>
>> <ga1-ext.corosync.log-20131006.gz>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
--
Cordiali Saluti
Alessandro Bono