[Pacemaker] 3rd node just for quorum
Klaus Darilion
klaus.mailinglists at pernau.at
Thu Jun 9 14:18:40 UTC 2011
On 09.06.2011 01:05, Anton Altaparmakov wrote:
> Hi Klaus,
>
> On 8 Jun 2011, at 22:21, Klaus Darilion wrote:
>> Hi!
>>
>> Currently I have a 2 node cluster and I want to add a 3rd node to use
>> quorum to avoid split brain.
>>
>> The service (DRBD+DB) should only run on either node1 or node2. Node3
>> cannot provide the service - it should just help the other nodes
>> determine whether their own network is broken or the other node's network is broken.
>>
>> Is my idea useful?
>
> Yes. That is what we do for all our Pacemaker-based setups.
>
>> How do I add such a "simple" 3rd node - just by using location
>> constraints for the service to be run only on node1 or node2?
>
> Here is an example:
>
> [...]
Hi Anton!
Thanks for the config snippet. I am trying to add things to my config
one at a time, and I am already stuck even before adding the 3rd node.
Currently I have only configured the DRBD resource and the filesystem
resource:
node db1-bh
node db2-bh
primitive drbd0 ocf:linbit:drbd \
params drbd_resource="r0" \
op monitor interval="15s"
primitive drbd0_fs ocf:heartbeat:Filesystem \
params device="/dev/drbd0" directory="/mnt" fstype="ext4"
group grp_database drbd0_fs
ms ms_drbd0 drbd0 \
meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
colocation database_on_drbd0 inf: grp_database ms_drbd0:Master
property $id="cib-bootstrap-options" \
dc-version="1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b" \
cluster-infrastructure="openais" \
expected-quorum-votes="2" \
stonith-enabled="false" \
no-quorum-policy="ignore" \
pe-error-series-max="100" \
pe-warn-series-max="100" \
pe-input-series-max="100"
rsc_defaults $id="rsc-options" \
resource-stickiness="5"
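For the 3rd quorum-only node I plan to add later, I assume it would come
down to letting the node join the cluster, raising expected-quorum-votes
to 3, changing no-quorum-policy from "ignore" to e.g. "stop", and adding
location constraints so the resources never run on the quorum node.
Untested sketch - the node name db3-bh and the constraint IDs are only
placeholders:

node db3-bh
location loc_db_not_on_db3 grp_database -inf: db3-bh
location loc_drbd_not_on_db3 ms_drbd0 -inf: db3-bh

Since clone-max="2" is already set on ms_drbd0, DRBD itself should not be
cloned onto the third node anyway.

But I am already stuck one step before that: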
I start node 1 (node 2 is down). The problem is that the filesystem
cannot be started; crm_mon shows:
============
Last updated: Thu Jun 9 16:12:35 2011
Stack: openais
Current DC: db1-bh - partition WITHOUT quorum
Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b
2 Nodes configured, 2 expected votes
2 Resources configured.
============
Online: [ db1-bh ]
OFFLINE: [ db2-bh ]
 Master/Slave Set: ms_drbd0
     Masters: [ db1-bh ]
     Stopped: [ drbd0:1 ]
Failed actions:
drbd0_fs_start_0 (node=db1-bh, call=7, rc=1, status=complete): unknown error
Analysing the logfile, it seems that the filesystem primitive is started
before ms_drbd0 is promoted to Primary:
Jun 9 15:56:49 db1-bh pengine: [8667]: notice: clone_print: Master/Slave Set: ms_drbd0
Jun 9 15:56:49 db1-bh pengine: [8667]: notice: short_print: Slaves: [ db1-bh ]
Jun 9 15:56:49 db1-bh pengine: [8667]: notice: short_print: Stopped: [ drbd0:1 ]
Jun 9 15:56:49 db1-bh pengine: [8667]: info: native_color: Resource drbd0:1 cannot run anywhere
Jun 9 15:56:49 db1-bh pengine: [8667]: info: master_color: Promoting drbd0:0 (Slave db1-bh)
Jun 9 15:56:49 db1-bh pengine: [8667]: info: master_color: ms_drbd0: Promoted 1 instances of a possible 1 to master
Jun 9 15:56:49 db1-bh pengine: [8667]: info: master_color: Promoting drbd0:0 (Slave db1-bh)
Jun 9 15:56:49 db1-bh pengine: [8667]: info: master_color: ms_drbd0: Promoted 1 instances of a possible 1 to master
...
Jun 9 15:56:49 db1-bh Filesystem[8865]: INFO: Running start for /dev/drbd0 on /mnt
Jun 9 15:56:49 db1-bh lrmd: [8665]: info: RA output: (drbd0_fs:start:stderr) FATAL: Module scsi_hostadapter not found.
...
Jun 9 15:56:49 db1-bh Filesystem[8865]: ERROR: Couldn't sucessfully fsck filesystem for /dev/drbd0
...
Jun 9 15:56:50 db1-bh kernel: [21875.203353] block drbd0: role( Secondary -> Primary )
I suspect that Pacemaker tells DRBD to promote the Secondary to Primary
and immediately starts the Filesystem primitive - before DRBD has
promoted the resource to Primary.
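If that is indeed the cause, I guess what is missing is an explicit order
constraint so that the filesystem is only started after ms_drbd0 has been
promoted, something like this (untested sketch; the constraint ID is made
up, the resource names are from my configuration above):

order ord_fs_after_drbd inf: ms_drbd0:promote grp_database:start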
Any ideas how to solve this?
Thanks
Klaus