[ClusterLabs] Corosync: 100% cpu (corosync 2.3.5, libqb 0.17.1, pacemaker 1.1.13)
Pallai Roland
pallair at magex.hu
Thu Aug 6 13:54:38 UTC 2015
2015-08-06 15:24 GMT+02:00 Pallai Roland <pallair at magex.hu>:
>>> drbdtest1 corosync[4734]: [MAIN ] Corosync main process was not
>>> scheduled for 2590.4512 ms (threshold is 2400.0000 ms). Consider token
>>> timeout increase.
>>>
>>> and even drbd:
>>> drbdtest1 kernel: drbd p1: PingAck did not arrive in time.
>>>
>>
>> Kernel module blocked by unrelated userspace app?
>
>
> The nodes may be blocking each other, since they run on the same host,
> and that may be the reason for the DRBD timeout. But that is also weird:
> how can one guest block another entirely when there are idle cores on
> the host?
>
> All in all, the DRBD timeout was eliminated once a node got more than one
> logical core.
>
I have to correct myself:
the DRBD timeout is not fixed if only one node has more than one core. In
that case the other node still reports PingAck timeouts periodically. I
think the simplest explanation is that a spinning corosync can block even
kernel threads.
The DRBD timeout is fixed only if both nodes have more than one logical core.
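
For anyone who wants to track how far these pauses overshoot the threshold on each node, here is a minimal sketch that pulls the two numbers out of the corosync warning quoted above. It only assumes the message wording shown in that log line; the names `parse_pause` and `SchedulingPause` are made up for illustration and are not part of corosync or any of its tools.

```python
import re
from typing import NamedTuple, Optional

# Matches the wording of the warning quoted above:
#   "... not scheduled for 2590.4512 ms (threshold is 2400.0000 ms). ..."
PAUSE_RE = re.compile(
    r"not scheduled for (?P<pause>\d+(?:\.\d+)?) ms "
    r"\(threshold is (?P<threshold>\d+(?:\.\d+)?) ms\)"
)


class SchedulingPause(NamedTuple):
    """One scheduling-pause warning (hypothetical helper type)."""
    pause_ms: float
    threshold_ms: float

    @property
    def overrun_ms(self) -> float:
        # How long past the threshold the main process stayed unscheduled.
        return self.pause_ms - self.threshold_ms


def parse_pause(line: str) -> Optional[SchedulingPause]:
    """Return the pause and threshold from a log line, or None if absent."""
    match = PAUSE_RE.search(line)
    if match is None:
        return None
    return SchedulingPause(
        pause_ms=float(match.group("pause")),
        threshold_ms=float(match.group("threshold")),
    )


if __name__ == "__main__":
    sample = (
        "drbdtest1 corosync[4734]: [MAIN ] Corosync main process was not "
        "scheduled for 2590.4512 ms (threshold is 2400.0000 ms). "
        "Consider token timeout increase."
    )
    result = parse_pause(sample)
    if result is not None:
        print(f"pause {result.pause_ms} ms, overrun {result.overrun_ms:.4f} ms")
```

Running it over the syslog of both nodes and lining the hits up against the DRBD PingAck messages would show whether the two always coincide, which is what the explanation above predicts.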