[Pacemaker] failed over filesystem mount points not coming up on secondary node
Lonni J Friedman
netllama at gmail.com
Mon Oct 1 21:41:31 UTC 2012
On Mon, Oct 1, 2012 at 2:14 PM, Jake Smith <jsmith at argotec.com> wrote:
> ----- Original Message -----
>> From: "Lonni J Friedman" <netllama at gmail.com>
>> To: "The Pacemaker cluster resource manager" <pacemaker at oss.clusterlabs.org>
>> Sent: Monday, October 1, 2012 4:31:05 PM
>> Subject: Re: [Pacemaker] failed over filesystem mount points not coming up on secondary node
>>
>> I'm still dead in the water here, and could really use some clues.
>>
>> I tried tweaking my config a bit to simplify it, in the hope that it
>> would at least work with fewer resources, but that too fails in the
>> exact same fashion. Specifically, the DRBD resource does fail over
>> and promote the old slave to master, but the failover IP never comes
>> up, and the DRBD-backed block device is never mounted on the new
>> master.
>>
>> location cli-prefer-ClusterIP ClusterIP \
>>     rule $id="cli-prefer-rule-ClusterIP" inf: #uname eq farm-ljf1
>
> This location constraint prevents ClusterIP from running on any node that isn't named farm-ljf1, because the score is infinity. If you just want farm-ljf1 to be preferred, use a finite score such as 100:.
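>
> For example (keeping your rule id; only the score changes):
>
> location cli-prefer-ClusterIP ClusterIP \
>     rule $id="cli-prefer-rule-ClusterIP" 100: #uname eq farm-ljf1
>
> Also worth noting: cli-prefer-* constraints are usually left behind by a "crm resource move"; if that's where this one came from, "crm resource unmove ClusterIP" should clear it entirely.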
>
>> colocation fs0_on_drbd inf: g_services FS0_Clone:Master
>> order FS0_drbd-after-FS0 inf: FS0_Clone:promote g_services
>
> When you specify an action for a resource in an order statement, it is inherited by the remaining resources unless they have their own explicit action - so this ends up being:
> order FS0_drbd-after-FS0 inf: FS0_Clone:promote g_services:promote
>
> You can't promote the resources that are part of the g_services group (promote isn't a supported action for them), so change it to:
> order FS0_drbd-after-FS0 inf: FS0_Clone:promote g_services:start
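>
> One way to apply that change (an untested sketch, assuming the crm shell) is to replace the constraint in place:
>
> crm configure delete FS0_drbd-after-FS0
> crm configure order FS0_drbd-after-FS0 inf: FS0_Clone:promote g_services:start
>
> ("crm configure edit" also works if you'd rather fix it in an editor.)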
Thanks so much for your (fast) reply. That indeed did the trick, and
everything is working (and failing over) as expected now. For
posterity, here's the corrected Pacemaker configuration, which works:
########
[root@farm-ljf1 ~]# crm configure show
node farm-ljf0 \
    attributes standby="off"
node farm-ljf1
primitive ClusterIP ocf:heartbeat:IPaddr2 \
    params ip="10.31.97.100" cidr_netmask="22" nic="eth1" \
    op monitor interval="10s" \
    meta target-role="Started"
primitive FS0 ocf:linbit:drbd \
    params drbd_resource="r0" \
    op monitor interval="10s" role="Master" \
    op monitor interval="30s" role="Slave"
primitive FS0_drbd ocf:heartbeat:Filesystem \
    params device="/dev/drbd0" directory="/mnt/sdb1" fstype="xfs" \
    meta target-role="Started"
group g_services FS0_drbd ClusterIP
ms FS0_Clone FS0 \
    meta master-max="1" master-node-max="1" clone-max="2" \
    clone-node-max="1" notify="true"
colocation fs0_on_drbd inf: g_services FS0_Clone:Master
order FS0_drbd-after-FS0 inf: FS0_Clone:promote g_services:start
property $id="cib-bootstrap-options" \
    dc-version="1.1.7-2.fc16-ee0730e13d124c3d58f00016c3376a1de5323cff" \
    cluster-infrastructure="openais" \
    expected-quorum-votes="2" \
    stonith-enabled="false" \
    no-quorum-policy="ignore"
########
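In case it's useful to anyone else, failover can be exercised by putting the active node in standby (a sketch, assuming the crm shell; substitute whichever node is currently active):
########
[root@farm-ljf1 ~]# crm node standby farm-ljf1   # resources should move to farm-ljf0
[root@farm-ljf1 ~]# crm_mon -1                   # verify where everything is running
[root@farm-ljf1 ~]# crm node online farm-ljf1    # bring the node back
########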
thanks!