[Pacemaker] Resource failover error
cfk at itri.org.tw
Fri Mar 18 07:37:06 UTC 2011
Dear all,
I am new to this mailing list. Please let me know if my explanation is not clear enough.
I set up a CentOS 5.4 cluster (2 nodes, alpha1 and alpha2) with the following software:
Corosync 1.3.0
Pacemaker 1.0.10
DRBD 8.3.9
The environment is built as an Active/Passive cluster, following http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf.
I set up four resources (IP, DRBD, Filesystem, Apache) and want to test different failover situations.
When I kill the corosync process on the active host (alpha1), Pacemaker fails to move the DRBD Master role to the originally passive host, alpha2.
The Corosync and DRBD configuration files are attached to this mail, and the crm configuration is listed below:
=====================================================================================
node alpha1
node alpha2
primitive ClusterIP ocf:heartbeat:IPaddr2 \
params ip="192.168.75.10" cidr_netmask="32" \
op monitor interval="10s"
primitive Disk ocf:linbit:drbd \
params drbd_resource="ccmadata" \
op monitor interval="60s"
primitive FS ocf:heartbeat:Filesystem \
params device="/dev/drbd0" directory="/var/www/html" fstype="ext3"
primitive WebSite ocf:heartbeat:apache \
params configfile="/etc/httpd/conf/httpd.conf" \
op monitor interval="1min"
ms DiskClone Disk \
meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
colocation drbd-with-ip inf: ClusterIP DiskClone:Master
colocation fs-on-drbd inf: FS DiskClone:Master
colocation website-with-fs inf: WebSite FS
order DiskClone-after-IP inf: DiskClone:promote ClusterIP:start
order FS-after-DiskClone inf: DiskClone:promote FS:start
order WebSite-after-FS inf: FS:start WebSite:start
property $id="cib-bootstrap-options" \
dc-version="1.0.10-da7075976b5ff0bee71074385f8fd02f296ec8a3" \
cluster-infrastructure="openais" \
expected-quorum-votes="2" \
stonith-enabled="false" \
no-quorum-policy="ignore"
=====================================================================================
The first abnormal status shown by the crm_mon command is:
=====================================================================================
Last updated: Thu Mar 17 18:19:04 2011
Stack: openais
Current DC: alpha2 - partition WITHOUT quorum
Version: 1.0.10-da7075976b5ff0bee71074385f8fd02f296ec8a3
2 Nodes configured, 2 expected votes
4 Resources configured.
============
Online: [ alpha2 ]
OFFLINE: [ alpha1 ]
Master/Slave Set: DiskClone
Slaves: [ alpha2 ]
Stopped: [ Disk:0 ]
=====================================================================================
The last abnormal status shown by crm_mon is:
=====================================================================================
============
Last updated: Thu Mar 17 18:20:01 2011
Stack: openais
Current DC: alpha2 - partition WITHOUT quorum
Version: 1.0.10-da7075976b5ff0bee71074385f8fd02f296ec8a3
2 Nodes configured, 2 expected votes
4 Resources configured.
============
Online: [ alpha2 ]
OFFLINE: [ alpha1 ]
Master/Slave Set: DiskClone
Slaves: [ alpha2 ]
Stopped: [ Disk:1 ]
Failed actions:
Disk:1_promote_0 (node=alpha2, call=12, rc=-2, status=Timed Out): unknown exec error
Disk:0_promote_0 (node=alpha2, call=22, rc=-2, status=Timed Out): unknown exec error
=====================================================================================
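For anyone who wants to pull the failed actions out of crm_mon output like the above, here is a minimal Python sketch. It is not part of the cluster setup. The function name parse_failed_actions and the regular expression are illustrative assumptions based only on the two "Failed actions" lines shown here, so other Pacemaker versions may print a different layout.

```python
import re

# Assumed layout of one "Failed actions:" entry, taken from the output above:
#   Disk:1_promote_0 (node=alpha2, call=12, rc=-2, status=Timed Out): unknown exec error
FAILED_ACTION = re.compile(
    r"^\s*(?P<op>\S+)\s+\(node=(?P<node>[^,]+),\s*call=(?P<call>-?\d+),"
    r"\s*rc=(?P<rc>-?\d+),\s*status=(?P<status>[^)]+)\):\s*(?P<reason>.*)$"
)


def parse_failed_actions(text):
    """Return one dict per failed action listed after 'Failed actions:'."""
    failures = []
    in_section = False
    for line in text.splitlines():
        if line.strip().lower().startswith("failed actions"):
            in_section = True
            continue
        if not in_section:
            continue
        match = FAILED_ACTION.match(line)
        if match:
            entry = match.groupdict()
            entry["call"] = int(entry["call"])  # operation call id
            entry["rc"] = int(entry["rc"])      # return code reported by crm_mon
            failures.append(entry)
    return failures


# Sample text copied from the crm_mon output in this mail.
SAMPLE = """\
Online: [ alpha2 ]
OFFLINE: [ alpha1 ]
Master/Slave Set: DiskClone
Slaves: [ alpha2 ]
Stopped: [ Disk:1 ]
Failed actions:
Disk:1_promote_0 (node=alpha2, call=12, rc=-2, status=Timed Out): unknown exec error
Disk:0_promote_0 (node=alpha2, call=22, rc=-2, status=Timed Out): unknown exec error
"""

if __name__ == "__main__":
    for failure in parse_failed_actions(SAMPLE):
        print(failure["op"], failure["node"], failure["status"])
```

Run against the sample, this lists both promote operations on alpha2 with status "Timed Out", which makes it easier to see that the failures are timeouts of the promote action rather than errors from another operation.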
The Corosync log on host alpha1 is drbd_test_alpha1.log, and the one on host alpha2 is drbd_test_alpha2.log.
My questions are:
1) How can I solve this issue? Am I missing some crm configuration for this situation?
2) According to the Corosync log on host alpha2, Pacemaker wants to promote 2 DRBD masters (please correct me if I am wrong). The action fails because the cluster runs in Active/Passive mode and only 1 DRBD master is allowed to exist. Should I add additional crm or drbd.conf configuration?
3) I am still studying STONITH. Is my problem a split-brain issue?
Thanks for your help.
BR,
Chia-Feng Kang