[ClusterLabs] Failed actions .. constraint confusion
Ken Gaillot
kgaillot at redhat.com
Tue Nov 7 10:10:02 EST 2017
On Mon, 2017-11-06 at 19:55 -0800, Aaron Cody wrote:
> Hello
> I have set up an active/passive HA NFS/DRBD cluster on RHEL 7.2, and
> I keep getting this 'Failed Action' message (not always, but
> sometimes):
>
> Stack: corosync
> Current DC: ha-nfs2.lan.aaroncody.com (version 1.1.16-12.el7_4.4-94ff4df) - partition with quorum
> Last updated: Mon Nov 6 22:52:28 2017
> Last change: Mon Nov 6 22:47:20 2017 by hacluster via crmd on ha-nfs2.lan.aaroncody.com
>
> 2 nodes configured
> 8 resources configured
>
> Online: [ ha-nfs1.lan.aaroncody.com ha-nfs2.lan.aaroncody.com ]
>
> Full list of resources:
>
> Master/Slave Set: nfs-drbd-clone [nfs-drbd]
>     Masters: [ ha-nfs2.lan.aaroncody.com ]
>     Slaves: [ ha-nfs1.lan.aaroncody.com ]
> nfs-filesystem (ocf::heartbeat:Filesystem): Started ha-nfs2.lan.aaroncody.com
> nfs-root (ocf::heartbeat:exportfs): Started ha-nfs2.lan.aaroncody.com
> nfs-export1 (ocf::heartbeat:exportfs): Started ha-nfs2.lan.aaroncody.com
> nfs-server (ocf::heartbeat:nfsserver): Started ha-nfs2.lan.aaroncody.com
> nfs-ip (ocf::heartbeat:IPaddr2): Started ha-nfs2.lan.aaroncody.com
> nfs-notify (ocf::heartbeat:nfsnotify): Started ha-nfs2.lan.aaroncody.com
>
> Failed Actions:
> * nfs-server_start_0 on ha-nfs1.lan.aaroncody.com 'unknown error' (1): call=40, status=complete, exitreason='Failed to start NFS server locking daemons',
>     last-rc-change='Mon Nov 6 22:47:25 2017', queued=0ms, exec=202ms
>
>
>
> So, even though I have all my constraints set up to bring everything
> up on the DRBD master, the cluster still insists on trying to start
> the NFS server on the slave...
>
> Here are my constraints:
>
> Location Constraints:
> Ordering Constraints:
>   promote nfs-drbd-clone then start nfs-filesystem (kind:Mandatory)
>   start nfs-filesystem then start nfs-ip (kind:Mandatory)
>   start nfs-ip then start nfs-server (kind:Mandatory)
>   start nfs-server then start nfs-notify (kind:Mandatory)
>   start nfs-server then start nfs-root (kind:Mandatory)
>   start nfs-server then start nfs-export1 (kind:Mandatory)
> Colocation Constraints:
>   nfs-filesystem with nfs-drbd-clone (score:INFINITY) (with-rsc-role:Master)
>   nfs-ip with nfs-filesystem (score:INFINITY)
>   nfs-server with nfs-ip (score:INFINITY)
>   nfs-root with nfs-filesystem (score:INFINITY)
>   nfs-export1 with nfs-filesystem (score:INFINITY)
>   nfs-notify with nfs-server (score:INFINITY)
>
>
> any ideas what I'm doing wrong here? Did I mess up my constraints?
>
> TIA
>
The constraints look good to me. To debug this sort of thing, I would
grab the pe-input file from the transition that scheduled the start in
the wrong place, and run crm_simulate against it to get more
information. crm_simulate is not very user-friendly, so if you can
attach the pe-input file, I can take a look at it. (The pe-input file
will be listed at the end of the transition in the logs on the node
that was DC at the time; you'll see a bunch of "pengine:" messages,
including one showing that the resource was scheduled for a start on
that particular node.)
--
Ken Gaillot <kgaillot at redhat.com>