[Pacemaker] failover problem with pacemaker & drbd

Sat Aug 15 11:51:46 UTC 2009

I noticed that you are using a non-cluster file system, ext3, so you should be using a master/slave resource for drbd, not a simple primitive. (You also seem to be starting drbd with the system init scripts, which may not be the best thing to do.)

Please look at my previous post to the list, "Master/Slave resource cannot start", which has a working configuration using two drbd resources, nfs and samba, with pingd.

http://oss.clusterlabs.org/pipermail/pacemaker/2009-August/002339.html 

Please note that I am using drbd-8.3.2, which includes a new resource script under linbit:drbd.
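As a rough sketch only (untested against your cluster), a master/slave definition for your "credit" resource could look something like the following. It assumes the ocf:linbit:drbd agent from drbd 8.3.x is installed, that drbd is no longer started by the init scripts, and that res_drbd_credit is taken out of grp_credit. The names ms_drbd_credit, col_credit_on_drbd and ord_credit_after_drbd are placeholders I made up, and the # lines are just annotations.

```
# Assumption: replaces the existing heartbeat:drbddisk primitive
primitive res_drbd_credit ocf:linbit:drbd \
        params drbd_resource="credit" \
        op monitor interval="15s"
# One Master (Primary) at a time, one instance on each of the two nodes
ms ms_drbd_credit res_drbd_credit \
        meta master-max="1" master-node-max="1" \
        clone-max="2" clone-node-max="1" notify="true"
# Hypothetical constraint names: run the group only where drbd is Master,
# and only start it after the promotion has happened
colocation col_credit_on_drbd inf: grp_credit ms_drbd_credit:Master
order ord_credit_after_drbd inf: ms_drbd_credit:promote grp_credit:start
```

With something like this in place, Pacemaker rather than the init script decides which node is Primary, so the filesystem can only be mounted on the node that has been promoted.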

The drbd documentation has a decent example; the only thing it is missing is the pingd part of the configuration.

http://www.drbd.org/users-guide-emb/s-pacemaker-config.html 
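For the pingd part, a minimal sketch (again untested, with assumptions labeled) might look like this. The host_list value is the one from your config; cl_pingd and loc_credit_needs_ping are names I invented, multiplier="100" is an arbitrary choice, and it assumes res_pingd is removed from grp_credit so that it can run as a clone on both nodes.

```
primitive res_pingd ocf:pacemaker:pingd \
        params host_list="192.168.200.7" multiplier="100" \
        op monitor interval="10s" timeout="20s"
# Hypothetical clone name: run pingd on every node, not inside the group
clone cl_pingd res_pingd
# Hypothetical constraint name: keep the group off any node that has
# lost connectivity ("pingd" is the agent's default attribute name)
location loc_credit_needs_ping grp_credit \
        rule -inf: not_defined pingd or pingd lte 0
```

The idea is that each node records its own connectivity in the pingd attribute, and the location rule moves the group away from a node whose ping target is unreachable.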

HTH, 

Diego 

----- "Gerry kernan" <gerry.kernan at infinityit.ie> wrote: 
> 
> 

Hi 

I have set up 2 servers so that I can replicate a filesystem between them using drbd. I configured drbd, filesystem, IP address and pingd resources; I also have an lsb resource to start icobol.

I can stop and start the resource group and migrate it between servers using the Pacemaker GUI. But if I power down one of the servers or take it off the network, the resource group doesn't fail over to the other node.

Hopefully someone can point out where I have made a mistake or not configured something.

The Pacemaker config, drbd.conf and openais.conf are below.

node host1.localdomain
node host2.localdomain \
        attributes standby="false"
primitive res_drbd_credit heartbeat:drbddisk \
        operations $id="res_drbd_credit-operations" \
        op monitor interval="15" timeout="15" start-delay="15" \
        params 1="credit" \
        meta $id="res_drbd_credit-meta_attributes"
primitive res_filesystem_credit ocf:heartbeat:Filesystem \
        meta $id="res_filesystem_credit-meta_attributes" \
        operations $id="res_filesystem_credit-operations" \
        op monitor interval="20" timeout="40" start-delay="10" \
        params device="/dev/drbd0" directory="/credit" fstype="ext3"
primitive res_icobol_credit lsb:icobol \
        meta is-managed="true" \
        operations $id="res_icobol_credit-operations" \
        op monitor interval="15" timeout="15" start-delay="15"
primitive res_ip_credit ocf:heartbeat:IPaddr2 \
        meta $id="res_ip_credit-meta_attributes" \
        operations $id="res_ip_credit-operations" \
        op monitor interval="10s" timeout="20s" start-delay="5s" \
        params ip="192.168.200.1" cidr_netmask="255.255.255.0"
primitive res_pingd ocf:pacemaker:pingd \
        operations $id="res_pingd-operations" \
        op monitor interval="10" timeout="20" start-delay="1m" \
        params host_list="192.168.200.7"
group grp_credit res_drbd_credit res_filesystem_credit res_ip_credit res_icobol_credit res_pingd \
        meta target-role="started"
location cli-prefer-grp_credit grp_credit \
        rule $id="cli-prefer-rule-grp_credit" inf: #uname eq host2.localdomain
location cli-prefer-res_icobol_credit res_icobol_credit \
        rule $id="cli-prefer-rule-res_icobol_credit" inf: #uname eq host1.localdomain
location cli-standby-grp_credit grp_credit \
        rule $id="cli-standby-rule-grp_credit" -inf: #uname eq host1.localdomain
colocation loc_grp_credit inf: res_filesystem_credit res_drbd_credit
colocation loc_icobol inf: res_icobol_credit res_ip_credit
colocation loc_ip inf: res_ip_credit res_filesystem_credit
property $id="cib-bootstrap-options" \
        dc-version="1.0.4-6dede86d6105786af3a5321ccf66b44b6914f0aa" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        last-lrm-refresh="1250158583" \
        node-health-red="0" \
        stonith-enabled="false" \
        default-resource-stickiness="200" \
        no-quorum-policy="ignore" \
        stonith-action="poweroff"

[root@host1 ~]# cat /etc/drbd.conf
#
# please have a a look at the example configuration file in
# /usr/share/doc/packages/drbd/drbd.conf
#
global {
        usage-count yes;
}
common {
        protocol C;
}
resource credit {
        device /dev/drbd0;
        meta-disk internal;
        disk /dev/cciss/c0d0p5;
        on host1.localdomain {
                address 10.100.100.1:7789;
        }
        on host2.localdomain {
                address 10.100.100.2:7789;
        }
        handlers {
                split-brain "/usr/lib/drbd/notify-split-brain.sh root";
        }
}

# Please read the openais.conf.5 manual page
aisexec {
        # Run as root - this is necessary to be able to manage resources with Pacemaker
        user: root
        group: root
}
service {
        # Load the Pacemaker Cluster Resource Manager
        ver: 0
        name: pacemaker
        use_mgmtd: yes
        use_logd: yes
}
totem {
        version: 2
        # How long before declaring a token lost (ms)
        token: 5000
        # How many token retransmits before forming a new configuration
        token_retransmits_before_loss_const: 10
        # How long to wait for join messages in the membership protocol (ms)
        join: 1000
        # How long to wait for consensus to be achieved before starting a new round of membership configuration (ms)
        consensus: 2500
        # Turn off the virtual synchrony filter
        vsftype: none
        # Number of messages that may be sent by one processor on receipt of the token
        max_messages: 20
        # Stagger sending the node join messages by 1..send_join ms
        send_join: 45
        # Limit generated nodeids to 31-bits (positive signed integers)
        clear_node_high_bit: yes
        # Disable encryption
        secauth: on
        # How many threads to use for encryption/decryption
        threads: 0
        # Optionally assign a fixed node id (integer)
        # nodeid: 1234
        rrp_mode: active
        interface {
                ringnumber: 0
                bindnetaddr: 192.168.200.0
                mcastaddr: 239.0.0.42
                mcastport: 5405
        }
        interface {
                ringnumber: 1
                bindnetaddr: 10.100.100.0
                mcastaddr: 239.0.0.43
                mcastport: 5405
        }
}
logging {
        debug: on
        fileline: off
        to_syslog: yes
        to_stderr: off
        syslog_facility: daemon
        timestamp: on
}
amf {
        mode: disabled
}

Best regards, 

Gerry kernan 

Infinity Integration technology 

Suite 17 The mall Beacon Court 

Sandyford 

Dublin 18 

www.infinityit.ie 

P. +35312930090 

F. +35312930137 


> _______________________________________________ Pacemaker mailing list Pacemaker at oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker 

-- 
Diego Julian Remolina 
System Administrator - Systems Support Specialist IV 
School of Physics 
Georgia Institute of Technology 
Phone: (404) 385-3499 