[Pacemaker] Why is a node on which no failure has occurred reported as "lost"?

Thu Mar 6 08:17:06 EST 2014

On Thu, 06 Mar 2014 14:39:46 +0300
Vladislav Bogdanov <bubble at hoster-ok.com> wrote:

> > Probably best to poke the corosync guys about this.
> > 
> > However, <= .11 is known to cause significant CPU usage with that
> > many nodes. I can easily imagine this starving corosync of resources
> > and causing breakage.
> > 
> > I would _highly_ recommend retesting with the current git master of
> > pacemaker. I merged the new cib code last week which is faster by
> > _two_ orders of magnitude and uses significantly less CPU.  
> 
> Andrew, current git master (ee094a2) almost works; the only issue is
> that crm_diff calculates an incorrect diff digest. If I replace the
> digest in the diff by hand with what cib calculates as "expected", it
> applies correctly. Otherwise it fails with -206.

Ah! This sounds like the same issue that I am seeing with crmsh.

-- 
// Kristoffer Grönlund
// kgronlund at suse.com