[ClusterLabs] corosync eats the whole CPU core in epoll_wait() on one node in cluster
Vladislav Bogdanov
bubble at hoster-ok.com
Fri May 29 12:25:15 UTC 2015
Hi,
I just noticed the problem from the subject line on just one node of a 4-node cluster.
I've dumped the blackbox logs, but unfortunately that didn't help me
understand what's going on, because even the debug logs are too sparse.
strace on a running process doesn't show anything except epoll_wait.
...
epoll_wait(4, {{EPOLLIN, {u32=19, u64=3703511490016313363}}}, 12, 107) = 1
epoll_wait(4, {{EPOLLIN, {u32=19, u64=3703511490016313363}}}, 12, 107) = 1
epoll_wait(4, {{EPOLLIN, {u32=19, u64=3703511490016313363}}}, 12, 107) = 1
epoll_wait(4, {{EPOLLIN, {u32=19, u64=3703511490016313363}}}, 12, 107) = 1
...
But those calls are way too frequent:
# timeout 10 strace -p 2177 2>&1 | grep EPOLLIN >/tmp/corosync-epoll.log
Terminated
# wc -l /tmp/corosync-epoll.log
438399 /tmp/corosync-epoll.log
That means ~43840 calls per second.
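The arithmetic behind that figure can be sketched as below. The helper name calls_per_second is hypothetical and only for illustration; it does integer division, so it prints 43839 rather than the rounded ~43840.

```shell
# Hypothetical helper: average calls per second, given a line count
# and the length of the sampling window in seconds (integer division).
calls_per_second() {
    echo $(( $1 / $2 ))
}

# 438399 matching lines captured during the 10-second strace run above.
calls_per_second 438399 10
```

With the log file still around, the count could be fed in directly, e.g. calls_per_second "$(wc -l < /tmp/corosync-epoll.log)" 10.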
Other nodes show zero.
The Pacemaker DC is on another node.
Nodes are completely identical.
fd 19, which generates those events, is shown in lsof this way:
corosync 2177 root 19u unix 0xffff88062f896680 0t0 17987 socket
netstat for that inode (17987) shows:
unix 3 [ ] STREAM CONNECTED 17987 2177/corosync @cpg
So that socket is used by CPG.
The adjacent socket inode (the connecting peer, 17986) is used by pacemakerd.
strace of pacemakerd shows absolutely normal behaviour:
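The same fd-to-socket lookup can also be done straight from /proc, without lsof or netstat. This is only a sketch: the function names socket_inode_from_link and unix_socket_for_fd are hypothetical, and it assumes a Linux /proc where /proc/<pid>/fd/<n> links to "socket:[inode]" and /proc/net/unix lists the inode in its 7th column.

```shell
# Hypothetical helper: extract the inode from a /proc/<pid>/fd/<n> link
# target such as "socket:[17987]". Prints nothing for non-socket targets.
socket_inode_from_link() {
    printf '%s\n' "$1" | sed -n 's/^socket:\[\([0-9][0-9]*\)]$/\1/p'
}

# Hypothetical helper: print the /proc/net/unix entry for the socket behind
# a given PID and fd. Needs permission to read /proc/<pid>/fd (usually root).
unix_socket_for_fd() {
    pid=$1
    fd=$2
    link=$(readlink "/proc/$pid/fd/$fd") || return 1
    inode=$(socket_inode_from_link "$link")
    [ -n "$inode" ] || return 1
    # Column 7 is the inode; column 8, when present, is the socket path
    # (abstract sockets such as @cpg are shown with a leading "@").
    awk -v ino="$inode" '$7 == ino' /proc/net/unix
}
```

For the case above this would be run as unix_socket_for_fd 2177 19.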
poll([{fd=8, events=POLLIN}, {fd=6, events=POLLIN}, {fd=5, events=POLLIN}, {fd=4, events=POLLIN|POLLPRI}], 4, 500) = 0 (Timeout)
poll([{fd=8, events=POLLIN}, {fd=6, events=POLLIN}, {fd=5, events=POLLIN}, {fd=4, events=POLLIN|POLLPRI}], 4, 500) = 0 (Timeout)
poll([{fd=8, events=POLLIN}, {fd=6, events=POLLIN}, {fd=5, events=POLLIN}, {fd=4, events=POLLIN|POLLPRI}], 4, 500) = 0 (Timeout)
So, this looks like a defect, but where?
libqb seems to be the main suspect, but I'm not sure.
This is CentOS 6, with corosync 53f67a2 on top of libqb 0.17.1 (a recompile of
David's 0.17.1-1 dated Tue Aug 26 2014).
Pacemaker is fbc239b.
Best,
Vladislav