[Pacemaker] Why are nodes that have not failed marked as "lost"?

Andrew Beekhof andrew at beekhof.net
Mon Feb 17 19:49:30 EST 2014


On 31 Jan 2014, at 6:20 pm, yusuke iida <yusk.iida at gmail.com> wrote:

> Hi, all
> 
> I am measuring the performance of Pacemaker with the following combination of versions:
> Pacemaker-1.1.11.rc1
> libqb-0.16.0
> corosync-2.3.2
> 
> All nodes are KVM virtual machines.
> 
> After starting 14 nodes, I forcibly stopped node vm01.
> "virsh destroy vm01" was used for the stop.
> Then, in addition to the forcibly stopped node, other nodes were also separated from the cluster.
> 
> corosync then logs "Retransmit List:" messages in large quantities.

Probably best to poke the corosync guys about this.

However, Pacemaker <= 1.1.11 is known to cause significant CPU usage with that many nodes.
I can easily imagine this starving corosync of resources and causing breakage.

I would _highly_ recommend retesting with the current git master of pacemaker.
I merged the new CIB code last week, which is faster by _two_ orders of magnitude and uses significantly less CPU.

I'd be interested to hear your feedback.

> 
> Why are nodes that have not failed marked as "lost"?
> 
> Please advise if there is a problem somewhere in my setup.
> 
> I have attached the report from when the problem occurred.
> https://drive.google.com/file/d/0BwMFJItoO-fVMkFWWWlQQldsSFU/edit?usp=sharing
> 
> Regards,
> Yusuke
> -- 
> ---------------------------------------- 
> METRO SYSTEMS CO., LTD 
> 
> Yusuke Iida 
> Mail: yusk.iida at gmail.com
> ---------------------------------------- 
