[Pacemaker] Pacemaker 1.1.12 cib testing, crm_mon doesn't work
Johan Huysmans
johan.huysmans at inuits.be
Fri Jun 13 10:25:36 CEST 2014
Hi,
I performed some extra testing.
I cleared my complete cib (cibadmin -E), after which crm_mon showed some
information again.
I gradually started adding resources and monitored the cib.xml size
(cibadmin -Ql > cib.xml; ll -h cib.xml).
The file grew to about 430K. After I added another batch of resources,
crm_mon stopped responding.
I checked the files in /dev/shm:
-rw------- 1 hacluster root 516K jun 13 08:14 qb-cib_ro-event-58122-10738-13-data
-rw------- 1 hacluster root 8,1K jun 13 08:14 qb-cib_ro-event-58122-10738-13-header
-rw------- 1 hacluster root 516K jun 13 08:14 qb-cib_ro-request-58122-10738-13-data
-rw------- 1 hacluster root 8,1K jun 13 08:14 qb-cib_ro-request-58122-10738-13-header
-rw------- 1 hacluster root 516K jun 13 08:14 qb-cib_ro-response-58122-10738-13-data
-rw------- 1 hacluster root 8,1K jun 13 08:14 qb-cib_ro-response-58122-10738-13-header
-rw------- 1 hacluster root 96M jun 13 07:58 qb-cib_rw-event-58122-58130-12-data
-rw------- 1 hacluster root 8,1K jun 13 07:58 qb-cib_rw-event-58122-58130-12-header
-rw------- 1 hacluster root 96M jun 13 08:13 qb-cib_rw-request-58122-58130-12-data
-rw------- 1 hacluster root 8,1K jun 13 07:58 qb-cib_rw-request-58122-58130-12-header
-rw------- 1 hacluster root 96M jun 13 07:58 qb-cib_rw-response-58122-58130-12-data
-rw------- 1 hacluster root 8,1K jun 13 07:58 qb-cib_rw-response-58122-58130-12-header
Apparently the cib_rw*data files are 96M in size, but the cib_ro*data
files are only 516K.
If such a file must hold the complete cib, the cib_ro files are too small
for ours, which could explain why crm_mon isn't working while the write
actions give no problems.
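
To make that comparison concrete, here is a minimal sketch. It assumes that
a message can only be delivered if it is no larger than the ring buffer's
data file, which is an assumption about libqb and not something confirmed
in this thread. The sizes come from the listing above and from the
1091618-byte response that failed in the log of the first mail, quoted
below. The name `fits` and the dictionary are purely illustrative.

```python
# Sizes as shown by `ls -h` above (K = 1024 bytes, M = 1024 * 1024 bytes).
# ls rounds these values, so they are approximations.
KIB = 1024
MIB = 1024 * KIB

ring_data_bytes = {
    "cib_ro": 516 * KIB,  # qb-cib_ro-*-data
    "cib_rw": 96 * MIB,   # qb-cib_rw-*-data
}

# Size of the response to crm_mon that failed in the log quoted below.
failed_response_bytes = 1091618


def fits(message_bytes: int, buffer_bytes: int) -> bool:
    """Assumption: a message must be no larger than the data file to be sent."""
    return message_bytes <= buffer_bytes


for channel, size in ring_data_bytes.items():
    verdict = "fits" if fits(failed_response_bytes, size) else "does not fit"
    print(f"{channel}: {size} bytes, response {verdict}")
```

Under that assumption the response is roughly twice the size of the cib_ro
data file but far below the cib_rw one, which would match the symptoms.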
gr.
Johan
On 13-06-14 09:35, Johan Huysmans wrote:
> Hi,
>
> The PCMK_ipc_buffer was already set to 10000000 (10M).
>
> For testing I increased the buffer to 100000000 (100M), without results.
>
> I then decreased the buffer to the default 20480 (20K), because the log
> then shows a suggested value.
> After leaving it running for a couple of minutes I got these suggested
> values:
> Jun 13 06:33:53 [62206] SRV-5-1 cib: error: crm_ipc_prepare:
> Could not compress the message into less than the configured ipc
> limit (20480 bytes).Set PCMK_ipc_buffer to a higher value (3854984
> bytes suggested)
> Jun 13 06:33:54 [62206] SRV-5-1 cib: error: crm_ipc_prepare:
> Could not compress the message into less than the configured ipc
> limit (20480 bytes).Set PCMK_ipc_buffer to a higher value (7709968
> bytes suggested)
> Jun 13 06:33:58 [62206] SRV-5-1 cib: error: crm_ipc_prepare:
> Could not compress the message into less than the configured ipc
> limit (20480 bytes).Set PCMK_ipc_buffer to a higher value (15419936
> bytes suggested)
> Jun 13 06:35:55 [62206] SRV-5-1 cib: error: crm_ipc_prepare:
> Could not compress the message into less than the configured ipc
> limit (20480 bytes).Set PCMK_ipc_buffer to a higher value (30839872
> bytes suggested)
> Jun 13 06:37:56 [62206] SRV-5-1 cib: error: crm_ipc_prepare:
> Could not compress the message into less than the configured ipc
> limit (20480 bytes).Set PCMK_ipc_buffer to a higher value (61679744
> bytes suggested)
> Jun 13 07:16:47 [44112] SRV-5-1 cib: error: crm_ipc_prepare:
> Could not compress the message into less than the configured ipc
> limit (20480 bytes).Set PCMK_ipc_buffer to a higher value (3952852
> bytes suggested)
> Jun 13 07:16:47 [44112] SRV-5-1 cib: error: crm_ipc_prepare:
> Could not compress the message into less than the configured ipc
> limit (20480 bytes).Set PCMK_ipc_buffer to a higher value (7905704
> bytes suggested)
> Jun 13 07:18:03 [44112] SRV-5-1 cib: error: crm_ipc_prepare:
> Could not compress the message into less than the configured ipc
> limit (20480 bytes).Set PCMK_ipc_buffer to a higher value (15811408
> bytes suggested)
> Jun 13 07:18:43 [44112] SRV-5-1 cib: error: crm_ipc_prepare:
> Could not compress the message into less than the configured ipc
> limit (20480 bytes).Set PCMK_ipc_buffer to a higher value (31622816
> bytes suggested)
> Jun 13 07:18:48 [44112] SRV-5-1 cib: error: crm_ipc_prepare:
> Could not compress the message into less than the configured ipc
> limit (20480 bytes).Set PCMK_ipc_buffer to a higher value (63245632
> bytes suggested)
> Jun 13 07:20:49 [44112] SRV-5-1 cib: error: crm_ipc_prepare:
> Could not compress the message into less than the configured ipc
> limit (20480 bytes).Set PCMK_ipc_buffer to a higher value (126491264
> bytes suggested)
> Jun 13 07:22:50 [44112] SRV-5-1 cib: error: crm_ipc_prepare:
> Could not compress the message into less than the configured ipc
> limit (20480 bytes).Set PCMK_ipc_buffer to a higher value (252982528
> bytes suggested)
> Jun 13 07:22:52 [44112] SRV-5-1 cib: error: crm_ipc_prepare:
> Could not compress the message into less than the configured ipc
> limit (20480 bytes).Set PCMK_ipc_buffer to a higher value (505965056
> bytes suggested)
> Jun 13 07:23:57 [44112] SRV-5-1 cib: error: crm_ipc_prepare:
> Could not compress the message into less than the configured ipc
> limit (20480 bytes).Set PCMK_ipc_buffer to a higher value (1011930112
> bytes suggested)
> Jun 13 07:24:51 [44112] SRV-5-1 cib: error: crm_ipc_prepare:
> Could not compress the message into less than the configured ipc
> limit (20480 bytes).Set PCMK_ipc_buffer to a higher value (2023860224
> bytes suggested)
> Jun 13 07:26:52 [44112] SRV-5-1 cib: error: crm_ipc_prepare:
> Could not compress the message into less than the configured ipc
> limit (20480 bytes).Set PCMK_ipc_buffer to a higher value (-247246848
> bytes suggested)
> Jun 13 07:27:22 [44112] SRV-5-1 cib: error: crm_ipc_prepare:
> Could not compress the message into less than the configured ipc
> limit (20480 bytes).Set PCMK_ipc_buffer to a higher value (-494493696
> bytes suggested)
> Jun 13 07:29:22 [44112] SRV-5-1 cib: error: crm_ipc_prepare:
> Could not compress the message into less than the configured ipc
> limit (20480 bytes).Set PCMK_ipc_buffer to a higher value (-988987392
> bytes suggested)
>
> There is definitely something wrong. Is it the printing of the
> suggested value or is it something else ?
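
The logged numbers are consistent with a value that is doubled on each
failed attempt and stored in a signed 32-bit integer: 2023860224 doubled is
4047720448, which exceeds 2147483647 and wraps to -247246848. The sketch
below reproduces the second run of suggestions under that assumption. It
does not show Pacemaker's actual code, and `wrap_int32` and
`suggested_values` are illustrative names.

```python
def wrap_int32(value: int) -> int:
    """Interpret an integer as a two's-complement signed 32-bit value."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def suggested_values(start: int, count: int) -> list:
    """Double `start` repeatedly, wrapping each result to signed 32 bits."""
    values = [start]
    while len(values) < count:
        values.append(wrap_int32(values[-1] * 2))
    return values


# First suggestion logged by PID 44112 at 07:16:47, followed by 12 doublings.
for value in suggested_values(3952852, 13):
    print(value)
```

If that reading is right, the negative numbers are a side effect of the
doubling and say nothing about the buffer size that is actually needed.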
>
> If I check the cib.xml files in /var/lib/pacemaker/cib/ all files are
> a bit smaller than 300K.
>
> Changing these buffers did not solve my problem of not getting results
> from crm_mon.
>
> Gr.
> Johan
>
>
> On 13-06-14 01:13, Andrew Beekhof wrote:
>> On 12 Jun 2014, at 10:53 pm, Johan Huysmans<johan.huysmans at inuits.be> wrote:
>>
>>> Hi All,
>>>
>>> I deployed Pacemaker 1.1.12-rc2 on our platform to test the cib changes.
>>> This was needed on our setup as it contains 6 nodes, 150 resources and the cib process was using lots of cpu.
>>>
>>> With a limited set of resources (6 nodes, 30 resources) everything worked as expected, including crm_mon.
>>> When loading the complete set of resources we lost the crm_mon functionality on all nodes.
>>> The cluster is running as expected (running all resources) however we don't have any visibility.
>>>
>>> I noticed that operations performing changes (like crm resource stop <resourcename>) did actually work,
>>> but crm resource status didn't work (using crmsh-2.0+git46-1.1.x86_64).
>>>
>>> I noticed that /dev/shm/qb-cib_ro* files are created, and lsof shows that they are both opened by crm_mon and cib.
>>>
>>>
>>> When executing "crm_mon -1" I get the following messages in /var/log/messages (and /var/log/pacemaker.log)
>>> Jun 12 12:47:38 [8062] SRV-5-1 cib: notice: crm_ipcs_sendv: Response 2 to 0x1810370[17836] (1091618 bytes) failed: Resource temporarily unavailable (-11)
>>> Jun 12 12:47:38 [8062] SRV-5-1 cib: warning: do_local_notify: Sync reply to crm_mon failed: No message of desired type
>>>
>>>
>>> Restarting the pacemaker and cman service of 1 node didn't solve it.
>>>
>>>
>>> What is causing this problem and how can I resolve it ?
>> Almost certainly you're hitting IPC limits associated with large clusters.
>>
>> You should be able to tune:
>>
>> # PCMK_ipc_buffer=20480
>>
>> In /etc/sysconfig/pacemaker and then restart the cluster.
>>
>> Note also:
>>
>> # For non-systemd based systems, prefix 'export' to each enabled line
>>
>>
>>
>>
>> _______________________________________________
>> Pacemaker mailing list:Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home:http://www.clusterlabs.org
>> Getting started:http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs:http://bugs.clusterlabs.org