[Pacemaker] Question about cluster start-up in a 2 node cluster with a node offline.

James Flatten jflatten at iso-ne.com
Mon Feb 13 15:54:07 UTC 2012


On 02/12/2012 04:55 PM, Andreas Kurz wrote:
> 	op monitor role="Master" interval="30s"
> 	op monitor role="Slave"  interval="31s"
> [...] ipmi fencing device capable of
> fencing more than one node?
Andreas-

I applied both changes you mentioned, and the behavior persists.
Here is my current configuration:

    node nodea \
         attributes standby="off"
    node nodeb \
         attributes standby="off"
    primitive ClusterIP ocf:heartbeat:IPaddr2 \
         params ip="192.168.1.3" cidr_netmask="32" \
         op monitor interval="30s"
    primitive datafs ocf:heartbeat:Filesystem \
         params device="/dev/drbd0" directory="/data" fstype="ext3" \
         meta target-role="Started"
    primitive drbd0 ocf:linbit:drbd \
         params drbd_resource="drbd0" \
         op monitor interval="31s" role="Slave" \
         op monitor interval="30s" role="Master"
    primitive drbd1 ocf:linbit:drbd \
         params drbd_resource="drbd1" \
         op monitor interval="31s" role="Slave" \
         op monitor interval="30s" role="Master"
    primitive fence-nodea stonith:fence_ipmilan \
         params pcmk_host_list="nodeb" ipaddr="xxx.xxx.xxx.xxx" \
         login="xxxxxxx" passwd="xxxxxxxx" lanplus="1" timeout="4" auth="md5" \
         op monitor interval="60s"
    primitive fence-nodeb stonith:fence_ipmilan \
         params pcmk_host_list="nodea" ipaddr="xxx.xxx.xxx.xxx" \
         login="xxxxxxx" passwd="xxxxxxxx" lanplus="1" timeout="4" auth="md5" \
         op monitor interval="60s"
    primitive httpd ocf:heartbeat:apache \
         params configfile="/etc/httpd/conf/httpd.conf" \
         op monitor interval="1min"
    primitive patchfs ocf:heartbeat:Filesystem \
         params device="/dev/drbd1" directory="/patch" fstype="ext3" \
         meta target-role="Started"
    group web datafs patchfs ClusterIP httpd
    ms drbd0clone drbd0 \
         meta master-max="1" master-node-max="1" clone-max="2" \
         clone-node-max="1" notify="true" target-role="Master"
    ms drbd1clone drbd1 \
         meta master-max="1" master-node-max="1" clone-max="2" \
         clone-node-max="1" notify="true" target-role="Master"
    location fence-on-nodea fence-nodea \
         rule $id="fence-on-nodea-rule" -inf: #uname ne nodea
    location fence-on-nodeb fence-nodeb \
         rule $id="fence-on-nodeb-rule" -inf: #uname ne nodeb
    colocation datafs-with-drbd0 inf: web drbd0clone:Master
    colocation patchfs-with-drbd1 inf: web drbd1clone:Master
    order datafs-after-drbd0 inf: drbd0clone:promote web:start
    order patchfs-after-drbd1 inf: drbd1clone:promote web:start
    property $id="cib-bootstrap-options" \
         dc-version="1.1.6-3.el6-a02c0f19a00c1eb2527ad38f146ebc0834814558" \
         cluster-infrastructure="openais" \
         expected-quorum-votes="2" \
         stonith-enabled="false" \
         no-quorum-policy="ignore" \
         last-lrm-refresh="1328556424"
    rsc_defaults $id="rsc-options" \
         resource-stickiness="100"
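
For what it's worth, the configuration and the IPMI path can be
sanity-checked by hand before re-testing; the address and credentials
below are the same placeholders as in the config above, not real values,
and this assumes the RHCS-style single-letter options of fence_ipmilan:

    # Check the live CIB for syntax/constraint errors
    crm_verify -L -V
    # Out-of-band check that the lanplus/md5 parameters reach the BMC
    fence_ipmilan -a xxx.xxx.xxx.xxx -l xxxxxxx -p xxxxxxxx -P -A md5 -o status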

If the cluster is fully down and I start corosync and pacemaker on one 
node, the cluster fences the other node, but the services do not come up 
until the cluster-recheck-interval expires.  I have attached the 
corosync.log from this latest test.
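
As a stopgap while this is debugged (not a fix for the underlying
problem), shortening the recheck interval should at least make the policy
engine re-evaluate sooner after the fence completes; "2min" below is an
arbitrary example value, the default being 15 minutes:

    crm configure property cluster-recheck-interval="2min"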

-Davin
Attachment: corosync.log.problem.1
URL: <https://lists.clusterlabs.org/pipermail/pacemaker/attachments/20120213/1796ed8e/attachment-0004.ksh>

