[Pacemaker] quorum lost, Multiple primaries not allowed by config

Mistina Michal Michal.Mistina at virte.sk
Mon Nov 11 11:35:31 EST 2013


Hi Digimer,
Thank you for the reply.
I know that stonith is crucial for avoiding split-brain situations. However, I
don't have this option, because I am running the servers as virtual machines
on ESX. The only option is the stonith agent fence_vmware_soap, but the
connection from the virtual machines to the ESX management interface is
blocked in our environment.
I'd like to know whether DRBD failed because it was searching for a stonith
device that it could use to fence the other node. Or is there something else,
some other cause of this behaviour, that can be seen from the logs?

Best regards,
Michal Mistina

-----Original Message-----
From: Digimer [mailto:lists at alteeve.ca] 
Sent: Monday, November 11, 2013 4:31 PM
To: The Pacemaker cluster resource manager
Subject: Re: [Pacemaker] quorum lost, Multiple primaries not allowed by
config

On 11/11/13 02:56, Mistina Michal wrote:
> Dear all.
> 
> I'd like to know why our cluster is losing quorum, even though 
> no-quorum-policy="ignore" is set, and why DRBD is unsuccessful in 
> promoting to Primary on the node where it should run according to crm_mon 
> and the logs. The logs (/var/log/messages) say there is a problem 
> with promoting, but according to /proc/drbd DRBD is fine: it is 
> Secondary on the slave and correctly promoted to Primary on 
> the master. Is it possible that the timeouts are wrong? Or is DRBD 
> failing for some other reason? Maybe because the LVM filter is wrong?

>         stonith-enabled="false" \
>         no-quorum-policy="ignore" \

You have disabled fencing and quorum. With DRBD, fencing is critical and
must be enabled (and tested to work). Without it, you're asking for a
split-brain which can lead to data loss.

Please configure stonith (same thing as fencing) in pacemaker, make sure
that it works, then change DRBD's fencing policy to 'resource-and-stonith'
and set the fence-peer handler to use 'crm-fence-peer.sh'. This way, when the
nodes split, one will power off the other and then be able to safely promote
to primary and recover any lost services.
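
As a rough sketch of the DRBD side, assuming DRBD 8.4 syntax, a resource
named 'r0' (a placeholder, use your own resource name) and the handler
scripts installed under /usr/lib/drbd/ (the path may differ on your
distribution), the relevant fragment could look like this:

```
resource r0 {                      # 'r0' is a placeholder resource name
  disk {
    # Freeze I/O and call the fence-peer handler when the peer is lost.
    fencing resource-and-stonith;
  }
  handlers {
    # Adds a constraint in pacemaker that blocks promotion on the peer.
    fence-peer          "/usr/lib/drbd/crm-fence-peer.sh";
    # Removes that constraint again once the peer has resynced.
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }
  # ... the rest of your existing resource definition stays unchanged ...
}
```

This only makes sense once stonith actually works in pacemaker; with
stonith-enabled="false" the handler has nothing reliable to lean on.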

If your nodes have IPMI (or iLO, iDRAC, RSA, etc), you can use
fence_ipmilan. Be sure to disable acpid and add 'delay="15"' to the fence
device that targets your primary node. This way, should a partition happen,
the primary node has a 15 second head start in fencing the secondary node.
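
A minimal sketch of what that could look like in the crm shell, assuming two
nodes called 'node1' (primary) and 'node2'. The node names, resource names,
addresses and credentials are all made up for illustration, and the parameter
names are those of the fence_ipmilan agent as I know it, so check them
against the agent's metadata on your version:

```
# Fences node1 (the primary); the delay gives node1 a head start.
primitive fence_node1 stonith:fence_ipmilan \
        params pcmk_host_list="node1" ipaddr="192.0.2.11" \
               login="admin" passwd="secret" delay="15" \
        op monitor interval="60s"
# Fences node2; no delay, so node2 is shot first in a partition.
primitive fence_node2 stonith:fence_ipmilan \
        params pcmk_host_list="node2" ipaddr="192.0.2.12" \
               login="admin" passwd="secret" \
        op monitor interval="60s"
# Keep each fence device off the node it is meant to kill.
location l_fence_node1 fence_node1 -inf: node1
location l_fence_node2 fence_node2 -inf: node2
property stonith-enabled="true"
```

Once this is in place, test it by actually fencing each node from the other
before you rely on it.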

--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?
