[Pacemaker] Pacemaker 1.1.12 cib testing, crm_mon doesn't work

Johan Huysmans johan.huysmans at inuits.be
Thu Jun 12 12:53:52 UTC 2014


Hi All,

I deployed Pacemaker 1.1.12-rc2 on our platform to test the cib changes.
We needed this because our setup has 6 nodes and 150 resources, and 
the cib process was using a lot of CPU.

With a limited set of resources (6 nodes, 30 resources) everything 
worked as expected, including crm_mon.
After loading the complete set of resources, crm_mon stopped working 
on all nodes.
The cluster itself runs as expected (all resources are running), but we 
have no visibility into its state.

I noticed that operations that make changes still work (for example, 
crm resource stop <resourcename>),
but crm resource status does not (using crmsh-2.0+git46-1.1.x86_64).

I also noticed that /dev/shm/qb-cib_ro* files are created, and lsof shows 
that they are opened by both crm_mon and cib.
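
For anyone who wants to check the same thing on their own nodes, here is a 
minimal Python sketch that lists the files matching that pattern together 
with their sizes. The function name list_ipc_files is only illustrative, 
and the sketch does not show which processes hold the files open (that 
still needs lsof).

```python
import glob
import os


def list_ipc_files(pattern="/dev/shm/qb-cib_ro*"):
    """Return a sorted list of (path, size_in_bytes) for files matching pattern."""
    entries = []
    for path in sorted(glob.glob(pattern)):
        try:
            entries.append((path, os.path.getsize(path)))
        except OSError:
            # The file disappeared between the glob and the stat call; skip it.
            continue
    return entries


if __name__ == "__main__":
    for path, size in list_ipc_files():
        print(f"{size:>12d}  {path}")
```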


When executing "crm_mon -1" I get the following messages in 
/var/log/messages (and /var/log/pacemaker.log):
Jun 12 12:47:38 [8062] SRV-5-1        cib:   notice: crm_ipcs_sendv: Response 2 to 0x1810370[17836] (1091618 bytes) failed: Resource temporarily unavailable (-11)
Jun 12 12:47:38 [8062] SRV-5-1        cib:  warning: do_local_notify: Sync reply to crm_mon failed: No message of desired type
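
To make the numbers in the first message easier to read, here is a small 
Python sketch that pulls out the response size and the error code. The 
regular expression and the field names are my own and match only this 
message layout; treating the number in square brackets as the client PID 
is an assumption.

```python
import re

# The crm_ipcs_sendv message from above, joined onto one line.
LOG_LINE = (
    "Jun 12 12:47:38 [8062] SRV-5-1        cib:   notice: crm_ipcs_sendv: "
    "Response 2 to 0x1810370[17836] (1091618 bytes) failed: "
    "Resource temporarily unavailable (-11)"
)

# Illustrative pattern, written for this one message layout only.
PATTERN = re.compile(
    r"Response (?P<response_id>\d+) to \S+\[(?P<client_pid>\d+)\] "
    r"\((?P<size>\d+) bytes\) failed: (?P<reason>.+?) \((?P<code>-?\d+)\)"
)


def parse_failed_response(line):
    """Return the fields of a failed-response message, or None if no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return {
        "response_id": int(match.group("response_id")),
        # Assumption: the bracketed number is the PID of the client.
        "client_pid": int(match.group("client_pid")),
        "size_bytes": int(match.group("size")),
        "reason": match.group("reason"),
        "code": int(match.group("code")),
    }


if __name__ == "__main__":
    info = parse_failed_response(LOG_LINE)
    print(info)
    print(f"response size: {info['size_bytes'] / 1024:.1f} KiB")
```

So the reply that cib failed to send to crm_mon is a bit over 1 MB, and 
the send failed with code -11 ("Resource temporarily unavailable").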


Restarting the pacemaker and cman services on one node did not solve it.


What is causing this problem, and how can I resolve it?


Thx,
Johan Huysmans



