[Pacemaker] quorum lost, Multiple primaries not allowed by config

Digimer lists at alteeve.ca
Mon Nov 11 10:30:32 EST 2013


On 11/11/13 02:56, Mistina Michal wrote:
> Dear all.
> 
> I’d like to know why our cluster is losing quorum even though
> no-quorum-policy="ignore" is set, and why DRBD fails to promote the
> primary on the node where it should run according to crm_mon and the
> logs. The logs (/var/log/messages) report a problem with promoting,
> but according to /proc/drbd the resource is fine: Secondary on the
> slave and Primary on the master. Is it possible that the timeouts are
> wrong? Or is DRBD failing for some reason, maybe because of a wrong
> LVM filter?

>         stonith-enabled="false" \
>         no-quorum-policy="ignore" \

You have disabled fencing and quorum. With DRBD, fencing is critical and
must be enabled (and tested to work). Without it, you're asking for a
split-brain which can lead to data loss.
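
As a first step, fencing can be switched back on in Pacemaker; a minimal sketch in crm shell syntax (the property name matches the one in your quoted config):

```
crm configure property stonith-enabled="true"
```

This alone is not enough; a working stonith device must also be configured, as described below.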

Please configure stonith (another name for fencing) in Pacemaker and make
sure it works, then set DRBD's 'fencing' option to
'resource-and-stonith' and its 'fence-peer' handler to
'crm-fence-peer.sh'. This way, when the nodes split, one will power off
the other and can then safely promote to Primary and recover any lost
services.
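
In DRBD terms, that amounts to something like the following in the resource configuration (the resource name 'r0' is a placeholder, and the handler paths may differ by distribution; the unfence handler is the usual companion so the fencing constraint is lifted after resync):

```
resource r0 {
        disk {
                fencing resource-and-stonith;
        }
        handlers {
                fence-peer          "/usr/lib/drbd/crm-fence-peer.sh";
                after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
        }
}
```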

If your nodes have IPMI (or iLO, iDRAC, RSA, etc.), you can use
fence_ipmilan. Be sure to disable acpid and add 'delay="15"' to the
stonith primitive that fences your primary node. That way, should a
partition happen, the primary node gets a 15-second head start in
fencing the secondary node.
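
A sketch of such a stonith pair in crm shell syntax (node names, addresses, and credentials are placeholders; the delay sits on the device that fences the primary, so the secondary hesitates before shooting and the primary wins a simultaneous fence race):

```
# Device that fences node1 (the primary): the delay makes node2
# wait 15 seconds before killing node1.
primitive fence_node1 stonith:fence_ipmilan \
        params ipaddr="192.168.1.1" login="admin" passwd="secret" \
               pcmk_host_list="node1" delay="15" \
        op monitor interval="60s"
# Device that fences node2 (the secondary): no delay.
primitive fence_node2 stonith:fence_ipmilan \
        params ipaddr="192.168.1.2" login="admin" passwd="secret" \
               pcmk_host_list="node2" \
        op monitor interval="60s"
# Keep each device off the node it is meant to fence.
location loc_fence_node1 fence_node1 -inf: node1
location loc_fence_node2 fence_node2 -inf: node2
```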

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?



