[Pacemaker] trouble with quorum
Andrew Beekhof
andrew at beekhof.net
Wed May 22 22:46:01 UTC 2013
On 22/05/2013, at 10:25 PM, Groshev Andrey <greenx at yandex.ru> wrote:
> Hello,
>
> I am trying to build a cluster with 2 nodes + one quorum node (without pacemaker).
This is the root of your problem.
Your config has:
> service {
>     name: pacemaker
>     ver: 1
> }
So even though you thought you only started corosync, you also started part of pacemaker.
Specifically, the part of pacemaker that gets loaded into corosync to provide membership and _quorum_ APIs to the other daemons.
The output from corosync-quorumtool is completely irrelevant to pacemaker in this kind of setup.
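To make the mismatch concrete, here is a rough Python sketch that reads the two views of quorum side by side. The helper names (corosync_quorate, pacemaker_quorate) are made up for illustration, and the sample strings are copied from the outputs quoted further down in this message; it only parses text and makes no claim about how either daemon computes quorum.

```python
import re

# Sample text copied from the outputs quoted in this thread.
QUORUMTOOL_SAMPLE = """Version: 1.4.5
Nodes: 2
Quorate: Yes
Expected votes: 3
Total votes: 2
Quorum: 2
"""
CRM_MON_SAMPLE = (
    "Current DC: dev-cluster2-node4.unix.tensor.ru - partition WITHOUT quorum"
)


def corosync_quorate(quorumtool_output: str) -> bool:
    """Hypothetical helper: read the 'Quorate:' line of corosync-quorumtool -s."""
    match = re.search(r"^\s*Quorate:\s*(\w+)", quorumtool_output, re.MULTILINE)
    if match is None:
        raise ValueError("no 'Quorate:' line found")
    return match.group(1).lower() == "yes"


def pacemaker_quorate(crm_mon_output: str) -> bool:
    """Hypothetical helper: read the 'partition with/WITHOUT quorum' text of crm_mon -1."""
    match = re.search(r"partition\s+(with|WITHOUT)\s+quorum", crm_mon_output)
    if match is None:
        raise ValueError("no 'partition ... quorum' text found")
    return match.group(1) == "with"


if __name__ == "__main__":
    # With the plugin-based setup described here, the two views can disagree.
    print("corosync-quorumtool quorate:", corosync_quorate(QUORUMTOOL_SAMPLE))
    print("crm_mon quorate:", pacemaker_quorate(CRM_MON_SAMPLE))
```

If the two values differ, as they do for the samples above, trust what crm_mon reports when predicting what pacemaker will do.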
Since you're on a RHEL derivative, I highly suggest using Pacemaker with CMAN (and updating to 6.4 while you're there :-).
In this case, the pacemaker daemons DO see the same quorum as corosync-quorumtool and your expectations would be correct.
Check out the quickstart: http://clusterlabs.org/quickstart-redhat.html
> The sequence of actions is as follows:
>
> 1. setup/start corosync on THREE nodes - all right.
> # corosync-quorumtool -l|sed 's/\..*$//'
> Nodeid Votes Name
> 295521290 1 dev-cluster2-node2
> 312298506 1 dev-cluster2-node3
> 329075722 1 dev-cluster2-node4
>
> 2. start pacemaker on FIRST node.
> 3. write config with crmsh .... stonith-enabled="false"
> 4. .... no-quorum-policy="ignore"
> 5. write main config ocf:heartbeat:pgsql
> Like: https://github.com/t-matsuo/resource-agents/wiki/Resource-Agent-for-PostgreSQL-9.1-streaming-replication
> But with one VIP on master PG
> Resources are started on first node.
>
> 6. Next, sync PG data with the SECOND node.
> 7. start pacemaker on the SECOND node. Resources started too.
> 8. no-quorum-policy="stop".
>
> Ok. All resources work on two nodes.
> See # corosync-quorumtool -l|sed 's/\..*$//'
> Nodeid Votes Name
> 295521290 1 dev-cluster2-node2
> 312298506 1 dev-cluster2-node3
> 329075722 1 dev-cluster2-node4
>
> # corosync-quorumtool -s
> Version: 1.4.5
> Nodes: 3
> Ring ID: 12440
> Quorum type: corosync_votequorum
> Quorate: Yes
> Node votes: 1
> Expected votes: 3
> Highest expected: 3
> Total votes: 3
> Quorum: 2
> Flags: Quorate
>
> See crm_mon.
> # crm_mon -1|grep quor
> Current DC: dev-cluster2-node3.unix.tensor.ru - partition with quorum
>
> Now, stop pacemaker on one node.
> # service pacemaker stop
>
> # corosync-quorumtool -s
> Version: 1.4.5
> Nodes: 3
> Ring ID: 12440
> Quorum type: corosync_votequorum
> Quorate: Yes
> Node votes: 1
> Expected votes: 3
> Highest expected: 3
> Total votes: 3
> Quorum: 2
> Flags: Quorate
>
> Now, on the second node, stop corosync.
> crm_mon says it lost quorum, but the resources are not stopped.
> crm_mon -1|grep quor
> Current DC: dev-cluster2-node4.unix.tensor.ru - partition WITHOUT quorum
>
> But corosync says that everything is fine ....
> # corosync-quorumtool -l|sed 's/\..*$//'
> Nodeid Votes Name
> 295521290 1 dev-cluster2-node2
> 329075722 1 dev-cluster2-node4
>
> # corosync-quorumtool -s
> Version: 1.4.5
> Nodes: 2
> Ring ID: 12440
> Quorum type: corosync_votequorum
> Quorate: Yes
> Node votes: 1
> Expected votes: 3
> Highest expected: 3
> Total votes: 2
> Quorum: 2
> Flags: Quorate
>
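For reference, the votequorum figures quoted above are consistent with plain majority arithmetic: with 3 expected votes the threshold is 2, so 2 remaining votes still count as quorate. A minimal Python sketch, assuming simple majority with one vote per node (the function names are illustrative, not part of any corosync API):

```python
def quorum_threshold(expected_votes: int) -> int:
    """Smallest vote count that is a strict majority of expected_votes."""
    return expected_votes // 2 + 1


def is_quorate(total_votes: int, expected_votes: int) -> bool:
    """True when the votes present reach the majority threshold."""
    return total_votes >= quorum_threshold(expected_votes)


if __name__ == "__main__":
    # Values taken from the corosync-quorumtool -s output above:
    # Expected votes: 3, Total votes: 2, Quorum: 2
    print(quorum_threshold(3))  # 2
    print(is_quorate(2, 3))     # True
    print(is_quorate(1, 3))     # False
```

This only explains why corosync-quorumtool keeps reporting "Quorate: Yes" after one node leaves; it says nothing about the quorum view that pacemaker itself uses in this setup.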
> Corosync config:
> totem {
>     version: 2
>     secauth: off
>     clear_node_high_bit: yes
>     threads: 0
>     interface {
>         ringnumber: 0
>         bindnetaddr: 10.76.157.18
>         mcastaddr: 239.94.1.56
>         mcastport: 5405
>         ttl: 1
>     }
> }
> logging {
>     fileline: off
>     to_stderr: no
>     to_logfile: yes
>     to_syslog: no
>     logfile: /var/log/cluster/corosync.log
>     debug: on
>     timestamp: on
>     logger_subsys {
>         subsys: AMF
>         debug: on
>     }
> }
>
> amf {
>     mode: disabled
> }
> service {
>     name: pacemaker
>     ver: 1
> }
> quorum {
>     provider: corosync_votequorum
>     expected_votes: 3
>     votes: 1
> }
>
>
> Why this strange behavior?
>
> My environment:
> CentOS 6.3
> corosync 1.4.5 from opensuse-ha
> pacemaker 1.1.9 from http://clusterlabs.org/rpm-next/rhel-6/
>