[Pacemaker] cluster doesn't failover - log at the end of msg
Luke Bigum
lbigum at iseek.com.au
Tue Nov 3 17:54:06 EST 2009
Hi Thomas,
You need a location constraint on your NFS resource and a ping/pingd resource to monitor network connectivity. Combined, they constrain NFS (or your DRBD Master resource) to nodes that still have network connectivity. See http://www.clusterlabs.org/wiki/Example_configurations#Set_up_pingd.
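A minimal sketch of that setup in crm shell syntax, along the lines of the wiki example, assuming 10.1.1.1 is a gateway reachable only through eth0 (the address, multiplier and constraint/clone names here are placeholders to adapt):

primitive pingd ocf:pacemaker:pingd \
        params host_list="10.1.1.1" multiplier="100" \
        op monitor interval="15s" timeout="20s"
clone pingd_clone pingd \
        meta globally-unique="false"
location nfs-on-connected-node nfs \
        rule -inf: not_defined pingd or pingd lte 0

With something like this in place, the nfs group (and, through the existing colocation, the DRBD Master) is moved away from any node whose pingd attribute is undefined or drops to 0.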
Luke Bigum
Systems Administrator
(p) 1300 661 668
(f) 1300 661 540
(e) lbigum at iseek.com.au
http://www.iseek.com.au
Level 1, 100 Ipswich Road Woolloongabba QLD 4102
This e-mail and any files transmitted with it may contain confidential and privileged material for the sole use of the intended recipient. Any review, use, distribution or disclosure by others is strictly prohibited. If you are not the intended recipient (or authorised to receive for the recipient), please contact the sender by reply e-mail and delete all copies of this message.
-----Original Message-----
From: Thomas Schneider [mailto:thomas.schneider at euskill.com]
Sent: Wednesday 4 November 2009 8:28 AM
To: pacemaker at oss.clusterlabs.org
Subject: [Pacemaker] cluster doesn't failover - log at the end of msg
Hi
Thanks for your help. I solved the problem by replacing this line:

location cli-standby-nfs nfs \
        rule $id="cli-standby-rule-nfs" -inf: #uname eq storage02.myriapulse.local

with:

location ms_drbd_nfs-master-storage01.myriapulse.local ms_drbd_nfs \
        rule $role="Master" 100: #uname eq storage01.myriapulse.local
But I have another problem because I have two network cards:
          eth1: replication link
--------------      ----------------
|  Server 1  |------|   Server 2   |
--------------      ----------------
      |                    |
     eth0                 eth0
      |                    |
         External Network
The data of the NFS server is accessed via eth0, but if I shut down eth0 on the node which is running the resource, the resource doesn't migrate and becomes unreachable. I think it's because of the replication link (which stays up).
node storage01.myriapulse.local \
        attributes standby="off"
node storage02.myriapulse.local \
        attributes standby="off"
primitive drbd_nfs ocf:linbit:drbd \
        params drbd_resource="nfs" \
        op monitor interval="15s" \
        meta target-role="Started"
primitive fs_nfs ocf:heartbeat:Filesystem \
        params device="/dev/drbd/by-res/nfs" directory="/share" fstype="ext3" \
        meta is-managed="true"
primitive ftp-server lsb:proftpd \
        op monitor interval="1min"
primitive ip_nfs ocf:heartbeat:IPaddr2 \
        params ip="10.1.1.69" nic="eth0"
primitive nfs-kernel-server lsb:nfs-kernel-server \
        op monitor interval="1min"
group nfs fs_nfs ip_nfs nfs-kernel-server \
        meta target-role="Started"
ms ms_drbd_nfs drbd_nfs \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
location drbd-fence-by-handler-ms_drbd_nfs ms_drbd_nfs \
        rule $id="drbd-fence-by-handler-rule-ms_drbd_nfs" $role="Master" -inf: #uname ne storage02.myriapulse.local
location ms_drbd_nfs-master-storage01.myriapulse.local ms_drbd_nfs \
        rule $id="ms_drbd_nfs-master-storage01.myriapulse.local-rule" $role="master" 100: #uname eq storage01.myriapulse.local
colocation ftp_on_nfs inf: ftp-server nfs
colocation nfs_on_drbd inf: nfs ms_drbd_nfs:Master
order ftp_after_nfs inf: nfs ftp-server
order nfs_after_drbd inf: ms_drbd_nfs:promote nfs:start
property $id="cib-bootstrap-options" \
        dc-version="1.0.5-unknown" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        no-quorum-policy="ignore" \
        stonith-enabled="false"
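As an aside, the drbd-fence-by-handler-ms_drbd_nfs constraint above is normally not written by hand: DRBD's crm-fence-peer.sh fence-peer handler inserts it into the CIB when the peer is fenced, and crm-unfence-peer.sh removes it again after resync. A typical drbd.conf fragment for that setup, assuming the stock script locations shipped with DRBD 8.3, looks roughly like this:

resource nfs {
        disk {
                fencing resource-only;
        }
        handlers {
                # adds the drbd-fence-by-handler-* location constraint on fencing
                fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
                # removes it again once the peers are back in sync
                after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
        }
        # existing device/disk/address settings unchanged
}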
Thanks for your help
Thomas Schneider
-----Original Message-----
From: drbd-user-bounces at lists.linbit.com
[mailto:drbd-user-bounces at lists.linbit.com] On behalf of Lars Ellenberg
Sent: Saturday 31 October 2009 10:15
To: drbd-user at lists.linbit.com
Subject: Re: [DRBD-user] cluster doesn't failover - log at the end of msg
On Sat, Oct 31, 2009 at 02:57:37AM +0100, Thomas Schneider wrote:
> Hello,
>
> I'm trying to set up a cluster for shared storage with Pacemaker, DRBD, and
> NFS. I use two servers, and there are two network interfaces on each server:
> eth0 connected to the network, and eth1 a direct link between the two nodes
> for DRBD replication.
> I can migrate the resource between the two nodes, but the problem is that
> when I do a hard power-off of server01 (the server where the resource is
> running), the second server (server02) doesn't fail over (the resource
> doesn't start). Maybe you can take a look at my config file below:
> storage01:~# crm configure show
> node storage01.myriapulse.local \
> attributes standby="off"
> node storage02.myriapulse.local \
> attributes standby="off"
> primitive drbd_nfs ocf:linbit:drbd \
> params drbd_resource="nfs" \
> op monitor interval="15s" \
> meta target-role="Started"
> primitive fs_nfs ocf:heartbeat:Filesystem \
> params device="/dev/drbd/by-res/nfs" directory="/share"
> fstype="ext3" \
> meta is-managed="true"
> primitive ftp-server lsb:proftpd \
> op monitor interval="1min"
> primitive ip_nfs ocf:heartbeat:IPaddr2 \
> params ip="10.1.1.69" nic="eth0"
> primitive nfs-kernel-server lsb:nfs-kernel-server \
> op monitor interval="1min"
> group nfs fs_nfs ip_nfs nfs-kernel-server \
> meta target-role="Started"
> ms ms_drbd_nfs drbd_nfs \
> meta master-max="1" master-node-max="1" clone-max="2"
> clone-node-max="1" notify="true"
> location cli-standby-nfs nfs \
> rule $id="cli-standby-rule-nfs" -inf: #uname eq
> storage02.myriapulse.local colocation ftp_on_nfs inf: ftp-server nfs
> colocation nfs_on_drbd inf: nfs ms_drbd_nfs:Master order ftp_after_nfs
inf:
> nfs ftp-server order nfs_after_drbd inf: ms_drbd_nfs:promote nfs:start
You really should get your line breaks right!
> property $id="cib-bootstrap-options" \
> dc-version="1.0.5-unknown" \
> cluster-infrastructure="openais" \
> expected-quorum-votes="2" \
> no-quorum-policy="ignore" \
> stonith-enabled="false"
> Oct 31 02:47:06 storage02 kernel: [419458.084631] block drbd0: helper command: /sbin/drbdadm fence-peer minor-0
> Oct 31 02:47:07 storage02 crm-fence-peer.sh[24594]: invoked for nfs
> Oct 31 02:47:07 storage02 kernel: [419459.283186] block drbd0: helper command: /sbin/drbdadm fence-peer minor-0 exit code 4 (0x400)
> Oct 31 02:47:07 storage02 kernel: [419459.283186] block drbd0: fence-peer helper returned 4 (peer was fenced)
> Oct 31 02:47:07 storage02 kernel: [419459.283186] block drbd0: role( Secondary -> Primary ) pdsk( DUnknown -> Outdated )
> Oct 31 02:47:07 storage02 kernel: [419459.343187] block drbd0: Creating new current UUID
> Oct 31 02:47:07 storage02 lrmd: [23822]: info: RA output: (drbd_nfs:0:promote:stdout)
> Oct 31 02:47:08 storage02 crmd: [23825]: info: process_lrm_event: LRM operation drbd_nfs:0_promote_0 (call=12, rc=0, cib-update=42, confirmed=true) complete ok
DRBD is promoted just fine.
The resources using it are not.
Which is expected:
As long as you explicitly forbid "nfs" to run on storage02,
you should not complain about nfs not being started on storage02.
Hint:
> location cli-standby-nfs nfs \
>         rule $id="cli-standby-rule-nfs" -inf: #uname eq storage02.myriapulse.local
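A cli-standby-* constraint like that is usually left behind by an earlier crm resource migrate with no target node; rather than editing it, one way to clear it (assuming the constraint id shown above) is:

crm resource unmigrate nfs
# or delete the constraint directly:
crm configure delete cli-standby-nfs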
--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com
DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list -- I'm subscribed
_______________________________________________
drbd-user mailing list
drbd-user at lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user
_______________________________________________
Pacemaker mailing list
Pacemaker at oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker