[Pacemaker] Help on setting order of resources
Adrian Gibanel
adrian.gibanel at btactic.com
Sat Aug 18 20:16:07 UTC 2012
Short description
-----------------------
Pacemaker seems to ignore my resource order settings.
Final goal
-----------
Run Zimbra in a high-availability (HA) setup.
Description of the system
-----------------------------------
The system runs Ubuntu 10.04 LTS, because the current stable Zimbra release works on Ubuntu 10.04 but not yet on 12.04.
I've dist-upgraded the packages from https://launchpad.net/~ubuntu-ha-maintainers/+archive/ppa, as advised on some sites.
My main configuration is based on this document: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
I've written some OCF resource agents of my own (for Zimbra and some network setup) and have already tested them with ocf-tester and ocf-tester-py (my own hack of ocf-tester that can test Python-based OCF scripts).
Finally, some package versions:
libcrmcluster1 1.1.6-2ubuntu0~ppa2
libcrmcommon2 1.1.6-2ubuntu0~ppa2
corosync 1.4.2-1ubuntu0~ppa1
libcorosync4 1.4.2-1ubuntu0~ppa1
lvm2 2.02.54-1ubuntu4.1ppa5
pacemaker 1.1.6-2ubuntu0~ppa2
libglib2.0-0 2.24.1-0ubuntu1.1~ppa1
cluster-glue 1.0.8-2ubuntu0~ppa4
libcluster-glue 1.0.8-2ubuntu0~ppa4
resource-agents 1:3.9.2-4ubuntu0~ppa2
crm configure show output:
-----------------------------------
adrian@zhatest-01:~$ sudo crm configure show
node zhatest-01.domain.com
node zhatest-02.domain.com
primitive ClusterDefaultRoute ocf:btactic:OVHdefaultroute \
        op monitor interval="30s"
primitive ClusterHostRoute ocf:btactic:OVHhostroute \
        params device="eth0" \
        op monitor interval="30s"
primitive ClusterIP ocf:heartbeat:IPaddr2 \
        params nic="eth0" ip="1.2.3.4" cidr_netmask="32" broadcast="1.2.3.4" \
        op monitor interval="30s"
primitive ClusterOVHFailover ocf:btactic:OVHfailover \
        op monitor interval="120s" timeout="60s" \
        op start interval="0" timeout="660" \
        op stop interval="0" timeout="660" \
        params nichandle="MYLOGIN" password="MYSECRET" failover="1.2.3.4" \
        meta target-role="Started"
primitive ZimbraData ocf:linbit:drbd \
        params drbd_resource="zimbradata" \
        op monitor interval="60s" role="Master" \
        op monitor interval="50s" role="Slave" \
        op start interval="0" role="Master" timeout="240" \
        op start interval="0" role="Slave" timeout="240" \
        op stop interval="0" role="Master" timeout="100" \
        op stop interval="0" role="Slave" timeout="100"
primitive ZimbraFS ocf:heartbeat:Filesystem \
        params device="/dev/drbd/by-res/zimbradata" directory="/opt/zimbra" fstype="ext4" \
        op start interval="0" timeout="60s" \
        op stop interval="0" timeout="60s"
primitive ZimbraServer ocf:btactic:zimbra \
        op monitor interval="2min" \
        op start interval="0" timeout="360s" \
        op stop interval="0" timeout="360s"
group MySystem ClusterOVHFailover ClusterIP ClusterHostRoute ClusterDefaultRoute
group MyZimbra ZimbraFS ZimbraServer
ms ZimbraDataClone ZimbraData \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
location prefer-zhatest-01 MyZimbra 50: zhatest-01.domain.com
colocation everything-together inf: MySystem ZimbraDataClone:Master MyZimbra
order everything-ordered inf: MySystem ZimbraDataClone:promote MyZimbra
property $id="cib-bootstrap-options" \
        no-quorum-policy="ignore" \
        stonith-enabled="false" \
        dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2"
rsc_defaults $id="rsc-options" \
        resource-stickiness="100"
crm_mon -orVVVV1 output:
----------------------------------
crm_mon[4215]: 2012/08/18_19:46:39 info: main: Starting crm_mon
crm_mon[4215]: 2012/08/18_19:46:39 info: unpack_config: Startup probes: enabled
crm_mon[4215]: 2012/08/18_19:46:39 notice: unpack_config: On loss of CCM Quorum: Ignore
crm_mon[4215]: 2012/08/18_19:46:39 info: unpack_config: Node scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
crm_mon[4215]: 2012/08/18_19:46:39 info: unpack_domains: Unpacking domains
crm_mon[4215]: 2012/08/18_19:46:39 info: determine_online_status: Node zhatest-01.domain.com is online
crm_mon[4215]: 2012/08/18_19:46:39 notice: unpack_rsc_op: Hard error - ZimbraServer_last_failure_0 failed with rc=5: Preventing ZimbraServer from re-starting on zhatest-01.domain.com
============
Last updated: Sat Aug 18 19:46:39 2012
Last change: Sat Aug 18 18:09:51 2012 via crmd on zhatest-01.domain.com
Stack: openais
Current DC: zhatest-01.domain.com - partition WITHOUT quorum
Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
2 Nodes configured, 2 expected votes
8 Resources configured.
============
Online: [ zhatest-01.domain.com ]
OFFLINE: [ zhatest-02.domain.com ]
Full list of resources:
 Resource Group: MySystem
     ClusterOVHFailover (ocf::btactic:OVHfailover): Stopped
     ClusterIP (ocf::heartbeat:IPaddr2): Stopped
     ClusterHostRoute (ocf::btactic:OVHhostroute): Stopped
     ClusterDefaultRoute (ocf::btactic:OVHdefaultroute): Stopped
 Resource Group: MyZimbra
     ZimbraFS (ocf::heartbeat:Filesystem): Stopped
     ZimbraServer (ocf::btactic:zimbra): Stopped
 Master/Slave Set: ZimbraDataClone [ZimbraData]
     Slaves: [ zhatest-01.domain.com ]
     Stopped: [ ZimbraData:1 ]
Operations:
* Node zhatest-01.domain.com:
   ZimbraData:0: migration-threshold=1000000
    + (9) start: rc=0 (ok)
    + (11) monitor: interval=50000ms rc=0 (ok)
   ZimbraServer: migration-threshold=1000000
    + (7) probe: rc=5 (not installed)
Failed actions:
    ZimbraServer_monitor_0 (node=zhatest-01.domain.com, call=7, rc=5, status=complete): not installed
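As a reading aid for the "Failed actions" line above, here is a minimal Python sketch that splits such a line into its parts and names the return code. It is not part of my setup: the function name parse_failed_action is made up for illustration, the table lists the standard OCF exit codes as I understand them, and I am assuming that the operation key follows the pattern <resource>_<operation>_<interval> and that a monitor with interval 0 is what crm_mon labels a probe (call=7 appears under both names above, which seems to agree).

```python
import re

# Standard OCF resource agent exit codes (numeric value -> symbolic name).
OCF_RETURN_CODES = {
    0: "OCF_SUCCESS",
    1: "OCF_ERR_GENERIC",
    2: "OCF_ERR_ARGS",
    3: "OCF_ERR_UNIMPLEMENTED",
    4: "OCF_ERR_PERM",
    5: "OCF_ERR_INSTALLED",
    6: "OCF_ERR_CONFIGURED",
    7: "OCF_NOT_RUNNING",
    8: "OCF_RUNNING_MASTER",
    9: "OCF_FAILED_MASTER",
}

# Hypothetical helper: matches "<key> (<name=value, ...>): <text>".
FAILED_ACTION = re.compile(r"^\s*(?P<key>\S+) \((?P<fields>[^)]*)\): (?P<text>.*)$")


def parse_failed_action(line):
    """Split one crm_mon 'Failed actions' line into a dictionary."""
    match = FAILED_ACTION.match(line)
    if match is None:
        raise ValueError("not a failed-action line: %r" % line)
    # Assumed key layout: <resource>_<operation>_<interval in ms>.
    resource, operation, interval = match.group("key").rsplit("_", 2)
    fields = dict(item.split("=", 1) for item in match.group("fields").split(", "))
    rc = int(fields["rc"])
    return {
        "resource": resource,
        "operation": operation,
        "node": fields["node"],
        "rc": rc,
        "rc_name": OCF_RETURN_CODES.get(rc, "unknown"),
        # Assumption: a monitor with interval 0 is a probe.
        "is_probe": operation == "monitor" and int(interval) == 0,
        "text": match.group("text"),
    }


if __name__ == "__main__":
    line = ("ZimbraServer_monitor_0 (node=zhatest-01.domain.com, call=7, "
            "rc=5, status=complete): not installed")
    print(parse_failed_action(line))
```

Read this way, the failed action is the probe of ZimbraServer on zhatest-01 returning the "not installed" code, which the crm_mon log above reports as a hard error that prevents ZimbraServer from starting on that node.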
Long description:
-----------------------
I expect the system to try to start the resources in the following order:
MySystem ZimbraDataClone:Master MyZimbra
which, after expanding the group members, is:
ClusterOVHFailover ClusterIP ClusterHostRoute \
ClusterDefaultRoute ZimbraDataClone:Master \
ZimbraFS ZimbraServer
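To make the expansion explicit, here is a small Python sketch of how I arrive at that list. It only illustrates my expectation; it is not how Pacemaker's policy engine computes ordering. The group members are copied from the crm configuration above, and expand_order is a made-up helper name.

```python
# Group definitions, copied from the "crm configure show" output.
GROUPS = {
    "MySystem": [
        "ClusterOVHFailover",
        "ClusterIP",
        "ClusterHostRoute",
        "ClusterDefaultRoute",
    ],
    "MyZimbra": ["ZimbraFS", "ZimbraServer"],
}

# The order I expect, as written in the description.
ORDER = ["MySystem", "ZimbraDataClone:Master", "MyZimbra"]


def expand_order(chain, groups):
    """Replace each group name in the chain by its members, keeping order."""
    expanded = []
    for name in chain:
        # Anything that is not a group (e.g. the master/slave set) stays as-is.
        expanded.extend(groups.get(name, [name]))
    return expanded


if __name__ == "__main__":
    print(" ".join(expand_order(ORDER, GROUPS)))
```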
Judging by the operation history that crm_mon -o shows in the log above, Pacemaker insists on starting ZimbraData first, and I don't want that.
So, that's it. Am I missing something? If you need more logs, don't hesitate to ask for them.
Thank you!
Other questions
---------------------
Where is the probe operation that appears in the crm_mon output documented?
P.S.: This unanswered email is very similar to my issue: http://lists.linux-ha.org/pipermail/linux-ha/2011-May/043144.html
--
Adrián Gibanel
I.T. Manager
+34 675 683 301
www.btactic.com