[Pacemaker] Stopping 300 Resources Causes Node To Go Offline

Thu Feb 2 21:31:43 UTC 2012

On Mon, Jan 30, 2012 at 11:40 AM, Gruen, Wolfgang <wgruen at idirect.net> wrote:
> We are running a cluster with 15 nodes and 300 resources.
>
> *** Stopping 300 Resources Causes Node 2 To Go Offline
>
> I ran the command
> cibadmin --replace --scope resources --xml-text "<resources/>"
> As a result, all running resources stopped, but node 2 went offline.
>
> [root@pcs_linuxha_2 ~]# crm status
>
> Connection to cluster failed: connection failed
>
> [root@pcs_linuxha_2 ~]# /etc/init.d/pacemaker status
>
> pacemakerd dead but pid file exists
>
> [root@pcs_linuxha_2 ~]# /etc/init.d/corosync

That is nowhere near enough information, sorry.
We need a crm_report tarball before we can comment further.
Perhaps open a bug (bugs.clusterlabs.org) and attach the tarball there for analysis.