[Pacemaker] DRBD-MySQL Pacemaker-Corosync won't fail over when heartbeat cable pulled.
Joe
weester at hotmail.com
Fri Oct 28 15:20:20 UTC 2011
I am so sorry for the confusion. I was using Heartbeat and could not get
it to work, so I switched over to Corosync.
Yes, I do have two NICs and two network cables: one crossover cable
connected to eth0 on both nodes for DRBD replication, and the other
cable connected to the LAN for heartbeat traffic. Thank you very much
for your help.
Node1 (192.168.1.140, eth1) <--- LAN ---> Node2 (192.168.1.150, eth1)
Node1 (10.0.0.10, eth0, DRBD) <--- crossover cable ---> Node2 (10.0.0.20, eth0, DRBD)
Virtual cluster IP: 192.168.1.160
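For reference, the matching DRBD resource definition looks roughly like
this (a sketch, not copied from my real config -- the backing disks and
the port number here are assumptions):

resource QD-RES {
        protocol C;
        on mysqldrbd01 {
                device    /dev/drbd0;
                disk      /dev/sdb1;       # hypothetical backing device
                address   10.0.0.10:7788;  # replication over the crossover link
                meta-disk internal;
        }
        on mysqldrbd02 {
                device    /dev/drbd0;
                disk      /dev/sdb1;       # hypothetical backing device
                address   10.0.0.20:7788;
                meta-disk internal;
        }
}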
Here is my Corosync and Pacemaker configuration.

crm configure show:
node mysqldrbd01 \
        attributes standby="off"
node mysqldrbd02 \
        attributes standby="off"
primitive res_Filesystem_QD-FS ocf:heartbeat:Filesystem \
        params device="/dev/drbd0" directory="/data/" fstype="ext3" \
        operations $id="res_Filesystem_QD-FS-operations" \
        op start interval="0" timeout="60" \
        op stop interval="0" timeout="60" \
        op monitor interval="20" timeout="40" start-delay="0" \
        op notify interval="0" timeout="60" \
        meta target-role="started"
primitive res_IPaddr2_QD-IP ocf:heartbeat:IPaddr2 \
        params ip="192.168.1.160" nic="eth1" \
        operations $id="res_IPaddr2_QD-IP-operations" \
        op start interval="0" timeout="20" \
        op stop interval="0" timeout="20" \
        op monitor interval="10" timeout="20" start-delay="0" \
        meta target-role="started"
primitive res_drbd_1 ocf:linbit:drbd \
        params drbd_resource="QD-RES" \
        operations $id="res_drbd_1-operations" \
        op start interval="0" timeout="240" \
        op promote interval="0" timeout="90" \
        op demote interval="0" timeout="90" \
        op stop interval="0" timeout="100" \
        op monitor interval="10" timeout="20" start-delay="1min" \
        op notify interval="0" timeout="90" \
        meta target-role="started"
primitive res_mysqld_QD-MYSQL-SRV lsb:mysqld \
        operations $id="res_mysqld_QD-MYSQL-SRV-operations" \
        op start interval="0" timeout="15" \
        op stop interval="0" timeout="15" \
        op monitor interval="15" timeout="15" start-delay="15"
ms ms_drbd_1 res_drbd_1 \
        meta clone-max="2" notify="true"
colocation IP-FS inf: res_IPaddr2_QD-IP res_Filesystem_QD-FS
colocation col_res_Filesystem_QD-FS_ms_drbd_1 inf: res_Filesystem_QD-FS ms_drbd_1:Master
order ord_ms_drbd_1_res_Filesystem_QD-FS inf: ms_drbd_1:promote res_Filesystem_QD-FS:start
order ord_res_IPaddr2_QD-IP_res_Filesystem_QD-FS inf: res_Filesystem_QD-FS res_IPaddr2_QD-IP:start
property $id="cib-bootstrap-options" \
        expected-quorum-votes="2" \
        stonith-enabled="false" \
        dc-version="1.0.11-1554a83db0d3c3e546cfd3aaff6af1184f79ee87" \
        no-quorum-policy="ignore" \
        cluster-infrastructure="openais" \
        last-lrm-refresh="1319814162"
rsc_defaults $id="rsc-options" \
        resource-stickiness="100"
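By the way, while pasting this I noticed that nothing ties
res_mysqld_QD-MYSQL-SRV to the filesystem or the virtual IP. If mysqld
is supposed to follow the /data mount, I believe a sketch like this
(untested, using the names above) would express it:

colocation col_mysql_fs inf: res_mysqld_QD-MYSQL-SRV res_Filesystem_QD-FS
order ord_fs_mysql inf: res_Filesystem_QD-FS res_mysqld_QD-MYSQL-SRV:start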
/etc/corosync/corosync.conf:
aisexec {
        user: root
        group: root
}
corosync {
        user: root
        group: root
}
amf {
        mode: disabled
}
logging {
        to_stderr: yes
        debug: off
        timestamp: on
        to_file: no
        to_syslog: yes
        syslog_facility: daemon
}
totem {
        version: 2
        token: 3000
        token_retransmits_before_loss_const: 10
        join: 60
        consensus: 4000
        vsftype: none
        max_messages: 20
        clear_node_high_bit: yes
        secauth: on
        threads: 0
        # nodeid: 1234
        rrp_mode: active
        # interface {
        #         ringnumber: 0
        #         bindnetaddr: 10.0.0.0
        #         mcastaddr: 226.94.1.1
        #         mcastport: 5405
        # }
        interface {
                ringnumber: 0
                bindnetaddr: 192.168.1.0
                mcastaddr: 226.94.1.1
                mcastport: 5405
        }
}
service {
        ver: 0
        name: pacemaker
        use_mgmtd: no
}
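Side note: since rrp_mode is already "active", the commented-out
interface block above could be re-enabled as a second ring over the
DRBD link, with ringnumber changed to 1 so it does not clash with the
LAN ring. A sketch (the ring-1 multicast address and port are my
guesses, not values from a working config):

        interface {
                ringnumber: 1
                bindnetaddr: 10.0.0.0
                mcastaddr: 226.94.2.1
                mcastport: 5407
        }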
On 10/28/2011 9:28 AM, Andreas Kurz wrote:
> Hello,
>
> On 10/28/2011 04:06 PM, Joe wrote:
>> Hello everyone,
>>
>> My goal is to build an HA DRBD and MySQL setup on two nodes
>> (active/passive). I followed the "Clusters from Scratch" article to
>> build this environment. If I put the active node into standby, failover
>> works fine, but when I pull the heartbeat cable from the active node,
>> the resources do not fail over to the secondary. Please advise. Thank
>> you very much. Joe
> * "the" heartbeat cable == one heartbeat cable? Not supported setup --
> use at least two heartbeat channels, add at least the DRBD replication
> network.
>
> * You produced a split-brain situation. I'd expect you to see errors
> in your cluster status, and DRBD log entries worded similarly to "not
> allowed to become primary because allow-two-primaries is not set" ...
> if you had also pulled the DRBD link, you would have had all resources
> running twice.
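(For the archives: DRBD resource-level fencing can feed exactly this
situation back into Pacemaker. A minimal sketch for the QD-RES section
of drbd.conf, assuming the handler scripts shipped with DRBD 8.3 are
installed:)

disk {
        fencing resource-only;
}
handlers {
        fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
        after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
}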
>
> * Use stonith to recover from a split-brain situation (or to block if
> it is unsuccessful).
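(In crm syntax that could look roughly like the sketch below, using
IPMI as an example fencing device; the addresses and credentials are
placeholders, and the external/ipmi parameter names should be checked
against the installed plugin:)

property stonith-enabled="true"
primitive st_mysqldrbd01 stonith:external/ipmi \
        params hostname="mysqldrbd01" ipaddr="192.168.1.201" \
        userid="admin" passwd="secret" interface="lan" \
        op monitor interval="60s"
primitive st_mysqldrbd02 stonith:external/ipmi \
        params hostname="mysqldrbd02" ipaddr="192.168.1.202" \
        userid="admin" passwd="secret" interface="lan" \
        op monitor interval="60s"
location l_st01 st_mysqldrbd01 -inf: mysqldrbd01
location l_st02 st_mysqldrbd02 -inf: mysqldrbd02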
>
> * read more manuals ;-)
>
>> CentOS 5.6 / DRBD 8.3 / Corosync / Pacemaker
>>
>> node-0 IP: 192.168.1.101 (heartbeat), 10.0.0.10 (DRBD)
>> node-1 IP: 192.168.1.102 (heartbeat), 10.0.0.20 (DRBD)
>> Cluster virtual IP: 192.168.1.160
>>
>> crm configure show:
>> node $id="2b68511d-b96f-4b56-9f66-70262e3e2c46" mysqldrbd01 \
>> attributes standby="off"
>> node $id="d86dc58b-2309-43d9-af96-6519127e83d7" mysqldrbd02 \
>> attributes standby="off"
> These ids are from the Heartbeat CCM ... but you attached a Corosync
> configuration? Decide on one ... either Heartbeat or Corosync.
>
> Regards,
> Andreas