[Pacemaker] Pacemaker 1.1.12 cib testing, crm_mon doesn't work
Johan Huysmans
johan.huysmans at inuits.be
Thu Jun 12 12:53:52 UTC 2014
Hi All,
I deployed Pacemaker 1.1.12-rc2 on our platform to test the cib changes.
We needed this because our setup has 6 nodes and 150 resources, and the
cib process was using a lot of CPU.
With a limited set of resources (6 nodes, 30 resources) everything
worked as expected, including crm_mon.
After loading the complete set of resources, crm_mon stopped working on
all nodes.
The cluster itself runs as expected (all resources are running), but we
have no visibility into its state.
I noticed that operations that make changes (such as "crm resource stop
<resourcename>") still work,
but "crm resource status" does not (using crmsh-2.0+git46-1.1.x86_64).
I also noticed that /dev/shm/qb-cib_ro* files are created, and lsof
shows that they are opened by both crm_mon and cib.
When executing "crm_mon -1" I get the following messages in
/var/log/messages (and /var/log/pacemaker.log):
Jun 12 12:47:38 [8062] SRV-5-1 cib: notice: crm_ipcs_sendv: Response 2 to 0x1810370[17836] (1091618 bytes) failed: Resource temporarily unavailable (-11)
Jun 12 12:47:38 [8062] SRV-5-1 cib: warning: do_local_notify: Sync reply to crm_mon failed: No message of desired type
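For anyone comparing this against their own logs, here is a minimal
Python sketch that pulls the reply size and error code out of the
crm_ipcs_sendv message. The regular expression is written against the
single log line quoted in this mail, not against Pacemaker's source, so
the field names (response_id, client_id, size_bytes) are my own labels,
and treating the bracketed number as a client identifier is an
assumption.

```python
import re
from typing import NamedTuple, Optional


class IpcFailure(NamedTuple):
    """Fields extracted from one crm_ipcs_sendv failure message.

    The names are illustrative labels, not Pacemaker terminology.
    """
    response_id: int
    client_id: int      # bracketed number in the log; meaning assumed
    size_bytes: int
    reason: str
    code: int


# Pattern derived only from the log line quoted in this mail.
PATTERN = re.compile(
    r"crm_ipcs_sendv:\s+Response (\d+) to 0x[0-9a-fA-F]+\[(\d+)\] "
    r"\((\d+) bytes\) failed: (.+?) \((-?\d+)\)"
)


def parse_ipc_failure(line: str) -> Optional[IpcFailure]:
    """Return the parsed fields, or None if the line does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return IpcFailure(
        response_id=int(match.group(1)),
        client_id=int(match.group(2)),
        size_bytes=int(match.group(3)),
        reason=match.group(4),
        code=int(match.group(5)),
    )


# The first log message from this mail, joined onto one line.
SAMPLE = (
    "Jun 12 12:47:38 [8062] SRV-5-1 cib: notice: crm_ipcs_sendv: "
    "Response 2 to 0x1810370[17836] (1091618 bytes) failed: "
    "Resource temporarily unavailable (-11)"
)

if __name__ == "__main__":
    failure = parse_ipc_failure(SAMPLE)
    if failure is not None:
        print(f"{failure.size_bytes} bytes, error {failure.code}: {failure.reason}")
```

For the message above this gives a reply of 1091618 bytes (about 1 MB)
that failed with error -11.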
Restarting the pacemaker and cman services on one node didn't solve it.
What is causing this problem, and how can I resolve it?
Thx,
Johan Huysmans