[Pacemaker] "Simple" LVM/drbd backed Primary/Secondary NFS cluster doesn't always failover cleanly
Justin Pasher
justinp at distribion.com
Tue Oct 23 15:29:44 UTC 2012
----- Original Message -----
> From: Andreas Kurz <andreas at hastexo.com>
> Date: Sun, 21 Oct 2012 01:38:46 +0200
> Subject: Re: [Pacemaker] "Simple" LVM/drbd backed Primary/Secondary
> NFS cluster doesn't always failover cleanly
> To: pacemaker at oss.clusterlabs.org
>
>
> On 10/18/2012 08:02 PM, Justin Pasher wrote:
>> I have a pretty basic setup by most people's standards, but there must
>> be something that is not quite right about it. Sometimes when I force a
>> resource failover from one server to the other, the clients with the NFS
>> mounts don't cleanly migrate to the new server. I configured this using
>> a few different "Pacemaker-DRBD-NFS" guides out there for reference (I
>> believe they were the Linbit guides).
> Are you using the latest "exportfs" resource-agent from the github
> repo? ... there have been bugfixes/improvements... and try to move the
> VIP for each export to the end of its group, so the IP the clients
> connect to is started last and stopped first.
>
> Regards,
> Andreas
I'm currently running the version that comes with the Debian
squeeze-backports resource-agents package (1:3.9.2-5~bpo60+1). I went
ahead and grabbed a copy of exportfs from the git repository. It's a
little risky for me to update the file right now, since the two
resources I'm most worried about are the NFS shares backing the
XenServer VDIs; any hiccup in the connection to the NFS server makes
things start exploding (e.g. guest VMs hit disk errors and remount
read-only).
I scanned through the changes quickly, and the biggest change I noticed
was how the .rmtab file backup is restored (it sorts and filters out
duplicate entries instead of just concatenating the results onto the
end of /var/lib/nfs/rmtab). I had actually tweaked that a little myself
earlier while trying to track down the problem.
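In other words, the restore went from blindly appending to merging and
de-duplicating; roughly the difference between these two forms (a
paraphrase from memory with illustrative variable names, not the exact
upstream code):

    # Old behavior: append the backup wholesale, accumulating duplicates
    cat "$rmtab_backup" >> /var/lib/nfs/rmtab

    # New behavior (approximately): merge the backup with the live
    # rmtab, keeping only unique entries
    sort -u "$rmtab_backup" /var/lib/nfs/rmtab > /var/lib/nfs/rmtab.tmp &&
        mv /var/lib/nfs/rmtab.tmp /var/lib/nfs/rmtab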
Ultimately, I think my problem is more related to the NFS server itself
and how it handles "unknown" client connections after a failover. I've
seen people here and there mention that /var/lib/nfs should be on the
replicated device to maintain consistency after failover, but the
exportfs resource agent doesn't do anything like that. Is that no
longer needed? At any rate, in my situation the problem is that I am
maintaining four independent NFS shares, each of which can fail over
separately (and run on either server at any time), so a simple copy of
the directory won't work, since there is no single "master" server at
any given time.
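Just so I'm sure I understand the group ordering you suggest, one of my
four groups would end up looking roughly like this in crm syntax, with
the VIP last so it starts last and stops first (resource names, paths,
and addresses here are illustrative, not my exact config):

    primitive p_lvm_share1 ocf:heartbeat:LVM \
        params volgrpname="vg_share1"
    primitive p_fs_share1 ocf:heartbeat:Filesystem \
        params device="/dev/vg_share1/lv_share1" directory="/srv/share1" \
               fstype="ext4"
    primitive p_exportfs_share1 ocf:heartbeat:exportfs \
        params directory="/srv/share1" clientspec="192.168.0.0/24" \
               options="rw,no_subtree_check" fsid="1"
    primitive p_ip_share1 ocf:heartbeat:IPaddr2 \
        params ip="192.168.0.101" cidr_netmask="24"
    # VIP at the end of the group: started last, stopped first
    group g_share1 p_lvm_share1 p_fs_share1 p_exportfs_share1 p_ip_share1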
Also, I found a bug in the way backup_rmtab() filters the export list
for its backup. Since it looks for a leading AND trailing colon (:), it
doesn't properly copy information about mounts of subdirectories under
the NFS export (e.g. instead of mounting /home, a client might mount
/home/username, as autofs does, and that entry won't get copied to the
.rmtab backup). I'll file a bug report about that.
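To illustrate, the filter in backup_rmtab() is essentially the
following (paraphrased; the exact variable names in the agent may
differ):

    # Current filter: the export path must be followed immediately by a
    # colon, so only exact-path mounts are captured
    grep ":${OCF_RESKEY_directory}:" /var/lib/nfs/rmtab > "$rmtab_backup"

A client that mounts a subdirectory shows up in rmtab as something like
192.168.0.50:/home/username:0x00000001, which ":/home:" never matches.
Something along these lines (untested) would also catch subdirectory
mounts:

    # Allow optional path components between the export path and the
    # trailing colon
    grep ":${OCF_RESKEY_directory}\(/[^:]*\)*:" /var/lib/nfs/rmtab > "$rmtab_backup"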
Thanks.
--
Justin Pasher