[Pacemaker] Cannot start VirtualDomain resource after restart

Kadlecsik József kadlecsik.jozsef at wigner.mta.hu
Wed Jun 20 13:48:15 EDT 2012


On Wed, 20 Jun 2012, Phil Frost wrote:

> Well, if the dot file you attached is the output of "crm_simulate -LS -D 
> pacemaker.dot", then this at least tells you that the policy engine, 
> given the current state of things, would like to do something else. 
> Normally when you run this you get an empty graph, because normally the 
> policy engine should be able to reach the target state that it 
> calculates. It looks like in your case it is not.

Yes, the attached file was created by the command you suggested.
 
> I really have no idea why lx0 is starting, but you can see from the 
> graph on what actions it depends. It's likely one of them is failing, 
> forcing the policy engine to recalculate its actions. We can see that 
> the policy engine thinks that in order to start lx0, it first has to 
> migrate mail0, then migrate caladan. It's also interesting that there 
> are a whole bunch of other migrations it thinks are necessary. Without 
> being intimately familiar with your environment it's hard to say if this 
> is expected or not. If you aren't expecting that, then you must have 
> some constraints configured that aren't what you intended.

Your crystal ball worked perfectly :-) - it was the memory utilization. 

It seems we have hit a bug: every node has enough memory to run the 
resource, taking the node and resource memory utilization settings into 
account, yet lx0 was still not started.
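
As a rough illustration of the kind of check meant here, the following is
a minimal sketch only. The node names, capacities and placements are
hypothetical and not taken from our cluster, and the policy engine's real
placement logic also depends on the configured placement-strategy and
constraints, which this does not model.

```python
# Hypothetical numbers for illustration only; not taken from the real cluster.
node_capacity_mb = {"node1": 16384, "node2": 16384}

# Memory utilization declared per resource (hypothetical values, in MB).
resource_memory_mb = {"mail0": 4096, "caladan": 8192, "lx0": 4096}

# Where each already-running resource currently sits (hypothetical placement).
placement = {"mail0": "node1", "caladan": "node2"}


def free_memory(node):
    """Capacity of a node minus the memory of the resources placed on it."""
    used = sum(resource_memory_mb[r] for r, n in placement.items() if n == node)
    return node_capacity_mb[node] - used


def nodes_that_fit(resource):
    """Nodes whose remaining memory covers the resource's requirement."""
    need = resource_memory_mb[resource]
    return sorted(n for n in node_capacity_mb if free_memory(n) >= need)


print(nodes_that_fit("lx0"))
```

If a check like this lists at least one node for lx0 while the policy
engine still refuses to start it, the arithmetic alone does not explain
the behaviour.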

From Debian squeeze backports, Pacemaker 1.1.7 is available to replace our 
running version 1.1.6, but the changelog doesn't mention any 
utilization-related fixes.

Best regards,
Jozsef
--
E-mail : kadlecsik.jozsef at wigner.mta.hu
PGP key: http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address: Wigner Research Centre for Physics, Hungarian Academy of Sciences
         H-1525 Budapest 114, POB. 49, Hungary



