[Pacemaker] DRBD/NFS MS doesn't fail over to the slave node
Guillaume Chanaud
guillaume.chanaud at connecting-nature.com
Mon Jul 5 10:41:05 UTC 2010
Hello,
I searched the list and tried lots of things, but nothing works, so I am trying my luck here.
I should mention that my configuration worked fine on heartbeat2/crm, but since I migrated to corosync/pacemaker I have a problem.
Here is my CIB:
node filer1 \
attributes standby="off"
node filer2 \
attributes standby="off"
primitive drbd_nfs ocf:linbit:drbd \
params drbd_resource="r0" \
op monitor interval="15s" timeout="60"
primitive fs_nfs ocf:heartbeat:Filesystem \
op monitor interval="120s" timeout="60s" \
params device="/dev/drbd0" directory="/data" fstype="ext4"
primitive ip_failover heartbeat:OVHfailover.py \
op monitor interval="120s" timeout="60s" \
params 1="cgXXXX-ovh" 2="******" 3="*****.ovh.net" 4="ip.ip.ip.ip"
primitive ip_nfs ocf:heartbeat:IPaddr2 \
op monitor interval="60s" timeout="20s" \
params ip="192.168.0.20" cidr_netmask="24" nic="vlan2019"
primitive nfs_server lsb:nfs \
op monitor interval="120s" timeout="60s"
group group_nfs ip_nfs fs_nfs nfs_server ip_failover \
meta target-role="Started"
ms ms_drbd_nfs drbd_nfs \
meta master-max="1" master-node-max="1" clone-max="2" \
clone-node-max="1" notify="true" target-role="Master"
colocation nfs_on_drbd inf: group_nfs ms_drbd_nfs:Master
order nfs_after_drbd inf: ms_drbd_nfs:promote group_nfs:start
property $id="cib-bootstrap-options" \
symmetric-cluster="true" \
no_quorum-policy="stop" \
default-resource-stickiness="0" \
default-resource-failure-stickiness="0" \
stonith-enabled="false" \
stonith-action="reboot" \
stop-orphan-resources="true" \
stop-orphan-actions="true" \
remove-after-stop="false" \
short-resource-names="true" \
transition-idle-timeout="3min" \
default-action-timeout="30s" \
is-managed-default="true" \
startup-fencing="true" \
cluster-delay="60s" \
expected-nodes="1" \
election_timeout="50s" \
expected-quorum-votes="2" \
dc-version="1.0.8-9881a7350d6182bae9e8e557cf20a3cc5dac3ee7" \
cluster-infrastructure="openais"
So I have a DRBD resource set up as master/slave, and a group containing
OVHfailover (a custom script I wrote to migrate a failover IP at my
hosting provider; this part works without problems), Filesystem (to mount
drbd0), nfs (to start the NFS server) and IPaddr2 (to attach an IP in a
vlan).
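For reference, this is how I dump and sanity-check the configuration above
(just the standard crm shell and crm_verify tools that ship with Pacemaker
1.0, nothing exotic):
# crm configure show
# crm_verify -L -V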
Now I start my two nodes:
# crm_mon
============
Last updated: Mon Jul 5 12:24:04 2010
Stack: openais
Current DC: filer1.connecting-nature.com - partition with quorum
Version: 1.0.8-9881a7350d6182bae9e8e557cf20a3cc5dac3ee7
2 Nodes configured, 2 expected votes
2 Resources configured.
============
Online: [ filer2 filer1 ]
Resource Group: group_nfs
ip_nfs (ocf::heartbeat:IPaddr2): Started filer1
fs_nfs (ocf::heartbeat:Filesystem): Started filer1
nfs_server (lsb:nfs): Started filer1
ip_failover (heartbeat:OVHfailover.py): Started filer1
Master/Slave Set: ms_drbd_nfs
Masters: [ filer1 ]
Slaves: [ filer2 ]
Everything's fine.
Now I stop filer1:
# /etc/init.d/corosync stop
It stops correctly.
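At that point I can still look at the cluster and DRBD state from filer2
(the commands below assume the usual crm_mon and DRBD userland tools, with
the resource name r0 from the configuration above):
# crm_mon -1
# cat /proc/drbd
# drbdadm role r0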
But in crm_mon I see:
============
Last updated: Mon Jul 5 11:28:59 2010
Stack: openais
Current DC: filer1.connecting-nature.com - partition WITHOUT quorum
Version: 1.0.8-9881a7350d6182bae9e8e557cf20a3cc5dac3ee7
2 Nodes configured, 2 expected votes
2 Resources configured.
============
Online: [ filer2 ]
OFFLINE: [ filer1 ]
And nothing happens: the resources do not migrate to filer2, which is
online; in fact, as shown above, they no longer appear at all.
Now if I restart filer1, the resources do migrate to filer2 and start
there once filer1 is back up.
I don't know where my mistake is. I have tried several different
configurations, but each time nothing happens. In the worst case it ends
up in an endless loop where filer1 tries to promote, then stops, then
filer2 tries to promote, then stops, again and again (one loop takes about a second).
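If it helps with the diagnosis, this is how I can dump the fail counts and
allocation scores while that loop is running, and how I reset a resource's
fail count between attempts (assuming the Pacemaker 1.0 tool names; ptest
is what later became crm_simulate):
# crm_mon -1 -f
# ptest -L -s
# crm_resource -C -r drbd_nfs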
Thanks for your help
Guillaume