[Pacemaker] Intermittent Failovers: route_ais_message: Sending message to local.crmd failed: ipc delivery failed (rc=-2)

Andrew Beekhof andrew at beekhof.net
Mon Nov 17 06:44:12 UTC 2014


> On 11 Nov 2014, at 1:32 am, Zach Wolf <ZWolf at doublepositive.com> wrote:
> 
> Hey Team,
> 
> I’m seeing strange intermittent failovers on a two-node cluster (roughly once every week or two). When this happens, both nodes become unavailable: one node is marked offline and the other is shown as unclean. Any help on this would be much appreciated. Thanks.
> 
> Running Ubuntu 12.04 (64-bit)
> Pacemaker 1.1.6-2ubuntu3.3
> Corosync 1.4.2-2ubuntu0.2
> 
> Here are the logs:
> Nov 08 14:26:26 corosync [pcmk  ] info: pcmk_ipc_exit: Client crmd (conn=0x12bebe0, async-conn=0x12bebe0) left
> Nov 08 14:26:26 corosync [pcmk  ] WARN: route_ais_message: Sending message to local.crmd failed: ipc delivery failed (rc=-2)
> Nov 08 14:26:27 corosync [pcmk  ] info: pcmk_ipc_exit: Client attrd (conn=0x12d0230, async-conn=0x12d0230) left
> Nov 08 14:26:32 corosync [pcmk  ] info: pcmk_ipc_exit: Client cib (conn=0x12c7d80, async-conn=0x12c7d80) left
> Nov 08 14:26:32 corosync [pcmk  ] info: pcmk_ipc_exit: Client stonith-ng (conn=0x12c3a20, async-conn=0x12c3a20) left
> Nov 08 14:26:32 corosync [pcmk  ] WARN: route_ais_message: Sending message to local.crmd failed: ipc delivery failed (rc=-2)
> Nov 08 14:26:32 corosync [pcmk  ] WARN: route_ais_message: Sending message to local.cib failed: ipc delivery failed (rc=-2)

Nothing at all from the crmd, cib, attrd or stonith-ng processes?
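
If it helps to check, here is a minimal Python sketch for pulling out whatever those four daemons logged in a given time window. It assumes syslog-style lines of the form "Mon DD HH:MM:SS [host] daemon: ..."; that layout and the function name daemon_lines are assumptions for illustration, not something confirmed by the logs in this thread, so adjust the pattern to the local log format.

```python
import re

# Daemon names taken from the question above.
DAEMONS = ("crmd", "cib", "attrd", "stonith-ng")

# Assumed prefix: "Nov 08 14:26:26 <rest of line>" (day may be space-padded).
LINE_RE = re.compile(r"^(?P<ts>\w{3} +\d{1,2} \d{2}:\d{2}:\d{2}) (?P<rest>.*)$")

# Assumed sender layout: optional hostname, then the daemon name.
SENDER_RE = re.compile(
    r"(\S+ )?(" + "|".join(re.escape(d) for d in DAEMONS) + r")\b"
)

def daemon_lines(lines, start, end):
    """Return lines whose HH:MM:SS lies in [start, end] and whose sender is a Pacemaker daemon."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # not a timestamped log line
        clock = m.group("ts").split()[-1]
        if not (start <= clock <= end):
            continue  # outside the window of interest
        # Lines written by the corosync plugin itself do not match here,
        # because their sender is "corosync", not one of the daemons.
        if SENDER_RE.match(m.group("rest")):
            hits.append(line)
    return hits
```

Feeding it the lines of the system log (the file location varies by setup) with a window such as "14:26:00" to "14:27:00" would show whether the daemons wrote anything before they disconnected.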

> Nov 08 14:26:32 corosync [pcmk  ] info: pcmk_ipc: Recorded connection 0x12bebe0 for stonith-ng/0
> Nov 08 14:26:32 corosync [pcmk  ] info: pcmk_ipc: Recorded connection 0x12c2f40 for attrd/0
> Nov 08 14:26:33 corosync [pcmk  ] info: pcmk_ipc: Recorded connection 0x12c72a0 for cib/0
> Nov 08 14:26:33 corosync [pcmk  ] info: pcmk_ipc: Sending membership update 12 to cib
> Nov 08 14:26:33 corosync [pcmk  ] info: pcmk_ipc: Recorded connection 0x12cb600 for crmd/0
> Nov 08 14:26:33 corosync [pcmk  ] info: pcmk_ipc: Sending membership update 12 to crmd
> 
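
For reading excerpts like the one above, a rough Python sketch that groups the corosync plugin messages by client, so each daemon's disconnect and reconnect times sit side by side. The two patterns only cover the pcmk_ipc_exit and "Recorded connection" messages quoted here; the function name client_timeline is made up for illustration.

```python
import re

# Patterns for the two corosync plugin messages quoted in this thread.
EXIT_RE = re.compile(
    r"(\d{2}:\d{2}:\d{2}) corosync \[pcmk\s*\] info: pcmk_ipc_exit: Client (\S+) "
)
JOIN_RE = re.compile(
    r"(\d{2}:\d{2}:\d{2}) corosync \[pcmk\s*\] info: pcmk_ipc: "
    r"Recorded connection \S+ for ([\w-]+)/"
)

def client_timeline(lines):
    """Map each client name to a list of (time, 'left' | 'connected') events, in log order."""
    events = {}
    for line in lines:
        for pattern, label in ((EXIT_RE, "left"), (JOIN_RE, "connected")):
            m = pattern.search(line)
            if m:
                events.setdefault(m.group(2), []).append((m.group(1), label))
    return events
```

Run over the lines quoted in this message, the crmd entry comes out as left at 14:26:26 and connected at 14:26:33, which makes the gap for each daemon easy to read off.
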
> Output of crm configure show:
> node p-sbc3 \
>         attributes standby="off"
> node p-sbc4 \
>         attributes standby="off"
> primitive fs lsb:FSSofia \
>         op monitor interval="2s" enabled="true" timeout="10s" on-fail="standby" \
>         meta target-role="Started"
> primitive fs-ip ocf:heartbeat:IPaddr2 \
>         params ip="10.100.0.90" nic="eth0:0" cidr_netmask="24" \
>         op monitor interval="10s"
> primitive fs-ip2 ocf:heartbeat:IPaddr2 \
>         params ip="10.100.0.99" nic="eth0:1" cidr_netmask="24" \
>         op monitor interval="10s"
> group cluster_services fs-ip fs-ip2 fs \
>         meta target-role="Started"
> property $id="cib-bootstrap-options" \
>         dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
>         cluster-infrastructure="openais" \
>         expected-quorum-votes="2" \
>         stonith-enabled="false" \
>         last-lrm-refresh="1348755080" \
>         no-quorum-policy="ignore"
> rsc_defaults $id="rsc-options" \
>         resource-stickiness="100"