[Pacemaker] Fail-over NFS Server (need cluster configuration check) [SOLVED]
Matteo Guglielmi
matteo.guglielmi at epfl.ch
Tue Feb 28 19:15:06 UTC 2012
It's always fun replying to one's own emails :-)
Solution to point (1): "Fully Sequential MS Promotion"
primitive p_drbd_home ocf:linbit:drbd \
params drbd_resource="home" \
op start interval="0" timeout="60" \
op stop interval="0" timeout="240" \
op monitor interval="20"
primitive p_drbd_software ocf:linbit:drbd \
params drbd_resource="software" \
op start interval="0" timeout="60" \
op stop interval="0" timeout="240" \
op monitor interval="20"
primitive p_drbd_srv ocf:linbit:drbd \
params drbd_resource="srv" \
op start interval="0" timeout="60" \
op stop interval="0" timeout="240" \
op monitor interval="20"
ms ms_drbd_home p_drbd_home \
meta master-max="1" master-node-max="1" \
clone-max="2" clone-node-max="1" notify="true"
ms ms_drbd_software p_drbd_software \
meta master-max="1" master-node-max="1" \
clone-max="2" clone-node-max="1" notify="true"
ms ms_drbd_srv p_drbd_srv \
meta master-max="1" master-node-max="1" \
clone-max="2" clone-node-max="1" notify="true"
colocation co_ms_drbd_software_with_ms_drbd_srv inf: \
ms_drbd_software:Master ms_drbd_srv:Master
order o_ms_drbd_software_after_ms_drbd_srv_promote mandatory: \
ms_drbd_srv:promote ms_drbd_software:start
colocation co_ms_drbd_home_with_ms_drbd_software inf: \
ms_drbd_home:Master ms_drbd_software:Master
order o_ms_drbd_home_after_ms_drbd_software_promote mandatory: \
ms_drbd_software:promote ms_drbd_home:start
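The chain above follows one pattern per adjacent pair: colocate the next master with the previous one, and start the next ms resource only after the previous one has been promoted. As a minimal sketch (the helper name chain_constraints is hypothetical and not part of the crm shell), the same four constraints can be generated from an ordered list of DRBD resource names, which may help if more devices are added later:

```python
# Sketch only: generate the chained colocation/order constraints for a list
# of DRBD resources, given in the order in which they should be promoted.
# chain_constraints is a hypothetical helper, not a crm shell feature.

def chain_constraints(names):
    """Return crm configure lines that promote ms_drbd_<name> one after another."""
    lines = []
    # Walk adjacent pairs: (srv, software), (software, home), ...
    for first, then in zip(names, names[1:]):
        lines.append(
            f"colocation co_ms_drbd_{then}_with_ms_drbd_{first} inf: "
            f"ms_drbd_{then}:Master ms_drbd_{first}:Master"
        )
        lines.append(
            f"order o_ms_drbd_{then}_after_ms_drbd_{first}_promote mandatory: "
            f"ms_drbd_{first}:promote ms_drbd_{then}:start"
        )
    return lines


if __name__ == "__main__":
    for line in chain_constraints(["srv", "software", "home"]):
        print(line)
```

Each generated line is the single-line form of the corresponding constraint above (without the backslash continuation).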
Solution to point (2): "Fully Sequential FS Mounting + DHCP server (after MS Promotion)"
primitive p_fs_home ocf:heartbeat:Filesystem \
params device="/dev/drbd/by-res/home" \
directory="/share/drbd/nfs/home" fstype="ext4" \
options="noatime,usrjquota=aquota.user,grpjquota=aquota.group,jqfmt=vfsv0" \
op start interval="0" timeout="60" \
op stop interval="0" timeout="240" \
op monitor interval="20"
primitive p_fs_software ocf:heartbeat:Filesystem \
params device="/dev/drbd/by-res/software" \
directory="/share/drbd/nfs/software" \
fstype="ext4" options="noatime" \
op start interval="0" timeout="60" \
op stop interval="0" timeout="240" \
op monitor interval="20"
primitive p_fs_srv ocf:heartbeat:Filesystem \
params device="/dev/drbd/by-res/srv" \
directory="/share/drbd/nfs/srv" \
fstype="ext4" options="noatime" \
op start interval="0" timeout="60" \
op stop interval="0" timeout="240" \
op monitor interval="20"
primitive p_ip_nfs ocf:heartbeat:IPaddr2 \
params ip="192.168.0.50" cidr_netmask="24" iflabel="nfs" \
op monitor interval="20"
primitive p_service_isc-dhcp-server lsb:isc-dhcp-server \
op start interval="0" timeout="60" \
op stop interval="0" timeout="240" \
op monitor interval="20"
group g_service_fs_ip_dhcp p_fs_srv p_fs_software p_fs_home \
p_ip_nfs p_service_isc-dhcp-server
colocation co_ms_drbd_home_with_g_service_fs_ip_dhcp inf: \
g_service_fs_ip_dhcp ms_drbd_home:Master
order o_g_service_fs_ip_dhcp_after_ms_drbd_home_promote mandatory: \
ms_drbd_home:promote g_service_fs_ip_dhcp:start
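The "fully sequential" mounting relies on the group itself: a Pacemaker group starts its members in the order they are listed and stops them in the reverse order. A small sketch of that behaviour for the group above (the function names are illustrative only):

```python
# Sketch only: model the start/stop sequence of a Pacemaker group.
# Members are started in listed order and stopped in reverse order.

GROUP = [
    "p_fs_srv",
    "p_fs_software",
    "p_fs_home",
    "p_ip_nfs",
    "p_service_isc-dhcp-server",
]


def start_sequence(members):
    """Order in which the group members are started."""
    return list(members)


def stop_sequence(members):
    """Order in which the group members are stopped (reverse of start)."""
    return list(reversed(members))


if __name__ == "__main__":
    print("start:", " -> ".join(start_sequence(GROUP)))
    print("stop: ", " -> ".join(stop_sequence(GROUP)))
```

So the floating IP and the dhcp server come up only after all three file systems are mounted, and they go down first on a stop or fail-over.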
Solution to point (3): "Fully Sequential Cloned NFS Exporting + QUOTA server (after FS Mounting + DHCP server)"
primitive p_service_nfs-common lsb:nfs-common \
op start interval="0" timeout="60" \
op stop interval="0" timeout="240" \
op monitor interval="20"
primitive p_service_nfs-kernel-server lsb:nfs-kernel-server \
op start interval="0" timeout="60" \
op stop interval="0" timeout="240" \
op monitor interval="20"
primitive p_service_quota lsb:quota \
op start interval="0" timeout="60" \
op stop interval="0" timeout="240" \
op monitor interval="20"
group g_service_nfs_quota p_service_nfs-common p_service_nfs-kernel-server \
p_service_quota
clone cl_g_service_nfs_quota g_service_nfs_quota
order o_cl_g_service_nfs_quota_after_p_service_isc-dhcp-server_start mandatory: \
g_service_fs_ip_dhcp:start cl_g_service_nfs_quota:start
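To double-check that the order constraints in the three solutions form a single chain, they can be fed to a topological sort. This is only a sketch: it models the explicit order constraints plus the implicit rule that a master/slave resource is started before it is promoted. It ignores which node each action runs on, and it is not how Pacemaker's policy engine computes transitions.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Explicit order constraints from the configuration: (first, then).
ORDER = [
    ("ms_drbd_srv:promote", "ms_drbd_software:start"),
    ("ms_drbd_software:promote", "ms_drbd_home:start"),
    ("ms_drbd_home:promote", "g_service_fs_ip_dhcp:start"),
    ("g_service_fs_ip_dhcp:start", "cl_g_service_nfs_quota:start"),
]

# Implicit: a master/slave resource is started before it is promoted.
IMPLICIT = [
    (f"{ms}:start", f"{ms}:promote")
    for ms in ("ms_drbd_srv", "ms_drbd_software", "ms_drbd_home")
]


def action_sequence(edges):
    """Return the actions in an order that satisfies every (first, then) pair."""
    graph = TopologicalSorter()
    for first, then in edges:
        graph.add(then, first)  # 'then' depends on 'first'
    return list(graph.static_order())


if __name__ == "__main__":
    for step, action in enumerate(action_sequence(ORDER + IMPLICIT), 1):
        print(step, action)
```

Because every action has exactly one predecessor, the result is a single linear sequence: srv is started and promoted, then software, then home, then the fs/IP/dhcp group, and finally the cloned nfs/quota group.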
Works like a charm,
--matt
On 02/27/12 03:32, Matteo Guglielmi wrote:
> On two machines (A and B) I've created three identical LVM
> partitions (DRBD backing device) called srv, home and software.
>
> The fs on all of them is ext4.
>
> The home fs has quotas.
>
> srv, home and software are exported via NFS.
>
> Both A and B also have an extra locally mounted fs (data1 and
> data2 respectively) with quotas; data1 and data2 are exported via
> NFS too (NO DRBD backing device for them... they are just local
> file systems).
>
> Both A and B have a dhcp server, but only one of them may be
> running: the one on the machine which has all three drbd fs
> in primary mode.
>
> A floating IP is used for mounting srv, software and home on all
> NFS clients.
>
> The cluster configuration I'd like to have should reproduce the
> following scenario:
>
>
> A: ( srv + home + software + IP + dhcp + nfs-server + quota-server)
> B: ( nfs-server + quota-server)
>
> or
>
> A: ( nfs-server + quota-server)
> B: ( srv + home + software + IP + dhcp + nfs-server + quota-server)
>
>
> ### Cluster Configuration ###
>
> 1) All ms_drbd must be in primary mode on the same host:
>
> primitive p_drbd_home ocf:linbit:drbd \
> params drbd_resource="home" \
> op start interval="0" timeout="60" \
> op stop interval="0" timeout="240" \
> op monitor interval="20"
> primitive p_drbd_software ocf:linbit:drbd \
> params drbd_resource="software" \
> op start interval="0" timeout="60" \
> op stop interval="0" timeout="240" \
> op monitor interval="20"
> primitive p_drbd_srv ocf:linbit:drbd \
> params drbd_resource="srv" \
> op start interval="0" timeout="60" \
> op stop interval="0" timeout="240" \
> op monitor interval="20"
> ms ms_drbd_home p_drbd_home \
> meta master-max="1" master-node-max="1" \
> clone-max="2" clone-node-max="1" notify="true"
> ms ms_drbd_software p_drbd_software \
> meta master-max="1" master-node-max="1" \
> clone-max="2" clone-node-max="1" notify="true"
> ms ms_drbd_srv p_drbd_srv \
> meta master-max="1" master-node-max="1" \
> clone-max="2" clone-node-max="1" notify="true"
> colocation co_ms_drbd_home_with_ms_drbd_srv_and_ms_drbd_software \
> inf: ms_drbd_home:Master ms_drbd_srv:Master ms_drbd_software:Master
>
> Questions:
>
> - is the "colocation" definition correct/enough?
>
> - how to enforce a sequence of events such as: promote software first,
> then if everything went ok promote srv, then if everything went ok
> promote home? (I would need this behavior because... see questions at
> the end of point 2)
>
> 2) Mounting srv, software and home fs + floating IP + dhcp server on the
> node hosting all drbd devices in primary mode:
>
> primitive p_fs_home ocf:heartbeat:Filesystem \
> params device="/dev/drbd/by-res/home" \
> directory="/share/drbd/nfs/home" fstype="ext4" \
> options="noatime,usrjquota=aquota.user,grpjquota=aquota.group,jqfmt=vfsv0" \
> op start interval="0" timeout="60" \
> op stop interval="0" timeout="240" \
> op monitor interval="20"
> primitive p_fs_software ocf:heartbeat:Filesystem \
> params device="/dev/drbd/by-res/software" \
> directory="/share/drbd/nfs/software" fstype="ext4" \
> options="noatime" \
> op start interval="0" timeout="60" \
> op stop interval="0" timeout="240" \
> op monitor interval="20"
> primitive p_fs_srv ocf:heartbeat:Filesystem \
> params device="/dev/drbd/by-res/srv" \
> directory="/share/drbd/nfs/srv" fstype="ext4" \
> options="noatime" \
> op start interval="0" timeout="60" \
> op stop interval="0" timeout="240" \
> op monitor interval="20"
> primitive p_ip_nfs ocf:heartbeat:IPaddr2 \
> params ip="192.168.0.50" cidr_netmask="24" iflabel="nfs" \
> op monitor interval="20"
> primitive p_service_isc-dhcp-server lsb:isc-dhcp-server \
> op start interval="0" timeout="60" \
> op stop interval="0" timeout="240" \
> op monitor interval="20"
> group g_service_fs_ip_dhcp p_fs_srv p_fs_software p_fs_home \
> p_ip_nfs p_service_isc-dhcp-server
> colocation co_ms_drbd_home_with_g_service_fs_ip_dhcp \
> inf: g_service_fs_ip_dhcp ms_drbd_home:Master
> order o_g_service_fs_ip_dhcp_after_ms_drbd_home_promote \
> inf: ms_drbd_home:promote g_service_fs_ip_dhcp:start
>
> Questions:
>
> - If I know that home is the last drbd device promoted into
> primary mode, then I'm ready to mount all fs, start the
> floating IP and dhcp server on the node where drbd home is
> in primary mode... are both colocation and order constraints
> correct?
>
> 3) nfs-server and quota-server must be started on both hosts
> once all filesystems are mounted:
>
> primitive p_service_nfs-common lsb:nfs-common \
> op start interval="0" timeout="60" \
> op stop interval="0" timeout="240" \
> op monitor interval="20"
> primitive p_service_nfs-kernel-server lsb:nfs-kernel-server \
> op start interval="0" timeout="60" \
> op stop interval="0" timeout="240" \
> op monitor interval="20"
> primitive p_service_quota lsb:quota \
> op start interval="0" timeout="60" \
> op stop interval="0" timeout="240" \
> op monitor interval="20"
> group g_service_nfs_quota p_service_nfs-common \
> p_service_nfs-kernel-server p_service_quota
> clone cl_g_service_nfs_quota g_service_nfs_quota
> order o_cl_g_service_nfs_quota_after_service_fs_ip_dhcp_start \
> inf: g_service_fs_ip_dhcp:start cl_g_service_nfs_quota
>
> Questions:
>
> - Here I'm really lost... and with this configuration my
> cluster does not act properly (many error messages) once I put
> one of the two nodes in standby... do you see anything weird
> here?
>
> ###
>
> I can post the error messages but I'd first like to make sure that
> the cluster configuration is at least not that bad...
>
> Thanks to all.
>
> --matt
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
> .
>