[Pacemaker] corosync crash

u.schmeling at online.de u.schmeling at online.de
Thu Feb 24 04:32:55 EST 2011



Hi,

my configuration has 2 nodes; one of them holds a set of virtual addresses and a web service. The situation before the crash:
node1: has all resources
node2: online, no resources

action on node2: crm standby node2
result on node1: corosync crashes, and its child processes consume all available CPU time

my actions: kill all child processes on node1 (kill -9) and restart corosync

result on node1:
node1: online, all resources
node2: offline

result on node2:
node1: offline
node2: online, all resources
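
The two views above contradict each other: each node reports itself online with all resources and its peer offline, which is a split-brain state. As an illustration only, here is a minimal Python sketch that flags this condition from status summaries like the ones above. The dictionary layout and the function name is_split_brain are assumptions made for this example; they are not part of Pacemaker or corosync, and the statuses are typed in by hand rather than read from the cluster.

```python
def is_split_brain(views):
    """Return True if some node sees a peer as offline while that
    peer reports itself as online.

    `views` maps an observing node to its own view of the cluster:
    {observer: {node_name: "online" or "offline"}}.
    This layout is hypothetical, chosen only for this sketch.
    """
    for observer, view in views.items():
        for peer, status in view.items():
            if peer == observer or status != "offline":
                continue
            peer_view = views.get(peer)
            # The peer is still alive if it answers and sees itself online.
            if peer_view is not None and peer_view.get(peer) == "online":
                return True
    return False


# Views transcribed by hand from the report: before the crash ...
before_crash = {
    "node1": {"node1": "online", "node2": "online"},
    "node2": {"node1": "online", "node2": "online"},
}

# ... and after restarting corosync on node1.
after_restart = {
    "node1": {"node1": "online", "node2": "offline"},
    "node2": {"node1": "offline", "node2": "online"},
}

print(is_split_brain(before_crash))   # False
print(is_split_brain(after_restart))  # True
```

The check only compares what each node says about membership. It does not look at where resources run, so it cannot tell by itself whether the virtual addresses are active on both nodes at once.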

The only workaround I have found is to remove node2 from the cluster and add it again.
There should be a better solution; maybe someone can help. I have attached the core dump and the fplay output.

Update: if I leave the cluster in the split-brain state, it recovers after about 9 hours (log file available).

Regards,
Uwe
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: coredump.txt
URL: <http://lists.clusterlabs.org/pipermail/pacemaker/attachments/20110224/37692f3f/attachment-0002.txt>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fplay.txt.gz
Type: application/x-gzip
Size: 123122 bytes
Desc: not available
URL: <http://lists.clusterlabs.org/pipermail/pacemaker/attachments/20110224/37692f3f/attachment.bin>

