[Pacemaker] NFS resource isn't completely working
Lonni J Friedman
netllama at gmail.com
Tue Oct 16 21:30:55 UTC 2012
Greetings,
I'm trying to get an NFS server export correctly monitored and
managed by Pacemaker, alongside pre-existing IP, DRBD, and filesystem
resources (which are all working correctly). While NFS is up on the
primary node (along with the other services), its monitor operation
keeps showing up as a failed action, reported as 'not running'.
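As far as I understand, the systemd resource class just asks systemd
for the unit's active state, so in theory I should be able to
reproduce what the monitor sees by hand on each node (the unit name
comes from the systemd:nfs-server primitive below):
################
# roughly what the systemd-class monitor checks for this unit
systemctl is-active nfs-server.service
systemctl status nfs-server.service
################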
Here's my current configuration:
################
node farm-ljf0 \
    attributes standby="off"
node farm-ljf1
primitive ClusterIP ocf:heartbeat:IPaddr2 \
    params ip="10.31.97.100" cidr_netmask="22" nic="eth1" \
    op monitor interval="10s" \
    meta target-role="Started"
primitive FS0 ocf:linbit:drbd \
    params drbd_resource="r0" \
    op monitor interval="10s" role="Master" \
    op monitor interval="30s" role="Slave"
primitive FS0_drbd ocf:heartbeat:Filesystem \
    params device="/dev/drbd0" directory="/mnt/sdb1" fstype="xfs" \
    meta target-role="Started"
primitive FS0_nfs systemd:nfs-server \
    op monitor interval="10s" \
    meta target-role="Started"
group g_services ClusterIP FS0_drbd FS0_nfs
ms FS0_Clone FS0 \
    meta master-max="1" master-node-max="1" clone-max="2" \
    clone-node-max="1" notify="true"
colocation fs0_on_drbd inf: g_services FS0_Clone:Master
order FS0_drbd-after-FS0 inf: FS0_Clone:promote g_services:start
property $id="cib-bootstrap-options" \
    dc-version="1.1.8-2.fc16-394e906" \
    cluster-infrastructure="openais" \
    expected-quorum-votes="2" \
    stonith-enabled="false" \
    no-quorum-policy="ignore"
################
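In case it's useful, this is how I dump and sanity-check the running
configuration (crm_verify -L validates the live CIB):
################
# validate the live CIB and show the full configuration
crm_verify -LV
crm configure show
################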
Here's the output from 'crm status':
################
Last updated: Tue Oct 16 14:26:22 2012
Last change: Tue Oct 16 14:23:18 2012 via cibadmin on farm-ljf1
Stack: openais
Current DC: farm-ljf1 - partition with quorum
Version: 1.1.8-2.fc16-394e906
2 Nodes configured, 2 expected votes
5 Resources configured.

Online: [ farm-ljf0 farm-ljf1 ]

 Master/Slave Set: FS0_Clone [FS0]
     Masters: [ farm-ljf1 ]
     Slaves: [ farm-ljf0 ]
 Resource Group: g_services
     ClusterIP  (ocf::heartbeat:IPaddr2):    Started farm-ljf1
     FS0_drbd   (ocf::heartbeat:Filesystem): Started farm-ljf1
     FS0_nfs    (systemd:nfs-server):        Started farm-ljf1

Failed actions:
    FS0_nfs_monitor_10000 (node=farm-ljf1, call=54357, rc=7, status=complete): not running
    FS0_nfs_monitor_10000 (node=farm-ljf0, call=131365, rc=7, status=complete): not running
################
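(Side note: rc=7 is OCF_NOT_RUNNING, and the fail-count on FS0_nfs is
enormous by now, 11940 and climbing. My understanding is that once
the underlying problem is fixed, the history can be reset with
something like the following, though I haven't done it yet:)
################
# clear failed actions and fail-counts for the NFS resource
crm resource cleanup FS0_nfs
# or inspect/reset the per-node fail-count directly (crmsh syntax)
crm resource failcount FS0_nfs show farm-ljf0
crm resource failcount FS0_nfs delete farm-ljf0
################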
When I check the cluster log, I see a lot of entries like these:
#############
Oct 16 14:23:17 [924] farm-ljf0 attrd: notice: attrd_trigger_update: Sending flush op to all hosts for: fail-count-FS0_nfs (11939)
Oct 16 14:23:17 [924] farm-ljf0 attrd: notice: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true)
Oct 16 14:23:17 [924] farm-ljf0 attrd: notice: attrd_ais_dispatch: Update relayed from farm-ljf1
Oct 16 14:23:17 [924] farm-ljf0 attrd: notice: attrd_trigger_update: Sending flush op to all hosts for: fail-count-FS0_nfs (11940)
Oct 16 14:23:17 [924] farm-ljf0 attrd: notice: attrd_perform_update: Sent update 25471: fail-count-FS0_nfs=11940
Oct 16 14:23:17 [924] farm-ljf0 attrd: notice: attrd_ais_dispatch: Update relayed from farm-ljf1
Oct 16 14:23:20 [923] farm-ljf0 lrmd: info: cancel_recurring_action: Cancelling operation FS0_nfs_status_10000
Oct 16 14:23:20 [926] farm-ljf0 crmd: info: process_lrm_event: LRM operation FS0_nfs_monitor_10000 (call=131365, status=1, cib-update=0, confirmed=false) Cancelled
Oct 16 14:23:20 [923] farm-ljf0 lrmd: info: systemd_unit_exec_done: Call to stop passed: type '(o)' /org/freedesktop/systemd1/job/1062961
Oct 16 14:23:20 [926] farm-ljf0 crmd: notice: process_lrm_event: LRM operation FS0_nfs_stop_0 (call=131369, rc=0, cib-update=35842, confirmed=true) ok
#############
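In case it helps, I can also watch the failures recur from the
cluster side (if I'm reading crm_mon's options right, -f shows the
per-resource fail counts):
################
# one-shot cluster status including per-resource fail counts
crm_mon -1 -f
################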
I'm not sure what any of that means. I'd appreciate some guidance.
Thanks!