[Pacemaker] Could not connect to the CIB: Remote node did not respond
Liang.Ma at asc-csa.gc.ca
Wed Feb 9 15:09:11 UTC 2011
I forgot to mention that this pair of nodes worked before, and I can still run "crm configure show". Here is the configuration:
node $id="bc6bf61d-6b5f-4307-85f3-bf7bb11531bb" arsvr2 \
attributes standby="off"
node $id="bf0e7394-9684-42b9-893b-5a9a6ecddd7e" arsvr1 \
attributes standby="off"
primitive apache2 lsb:apache2 \
op start interval="0" timeout="60" \
op stop interval="0" timeout="120" start-delay="15" \
meta target-role="Started"
primitive drbd_mysql ocf:linbit:drbd \
params drbd_resource="r0" \
op monitor interval="15s"
primitive drbd_webfs ocf:linbit:drbd \
params drbd_resource="r1" \
op monitor interval="15s" \
op start interval="0" timeout="240" \
op stop interval="0" timeout="100"
primitive fs_mysql ocf:heartbeat:Filesystem \
params device="/dev/drbd/by-res/r0" directory="/var/lib/mysql" fstype="ext4" \
op start interval="0" timeout="60" \
op stop interval="0" timeout="120" \
meta target-role="Started"
primitive fs_webfs ocf:heartbeat:Filesystem \
params device="/dev/drbd/by-res/r1" directory="/srv" fstype="ext4" \
op start interval="0" timeout="60" \
op stop interval="0" timeout="120" \
meta target-role="Started"
primitive ip1 ocf:heartbeat:IPaddr2 \
params ip="138.214.240.193" nic="eth0" \
op monitor interval="5s" \
meta target-role="Started"
primitive ip1arp ocf:heartbeat:SendArp \
params ip="138.214.240.193" nic="eth0" \
meta target-role="Started"
primitive mysql ocf:heartbeat:mysql \
params binary="/usr/bin/mysqld_safe" config="/etc/mysql/my.cnf" \
user="mysql" group="mysql" log="/var/log/mysql.log" \
pid="/var/lib/mysql/mysqld.pid" datadir="/var/lib/mysql" \
socket="/var/run/mysqld/mysqld.sock" \
op monitor interval="30s" timeout="30s" \
op start interval="0" timeout="120" \
op stop interval="0" timeout="120" \
meta target-role="Started"
group MySQLDB fs_mysql mysql \
meta target-role="Started"
group WebServices ip1 ip1arp fs_webfs apache2 \
meta target-role="Started"
ms ms_drbd_mysql drbd_mysql \
meta master-max="1" master-node-max="1" clone-max="2" \
clone-node-max="1" notify="true"
ms ms_drbd_webfs drbd_webfs \
meta master-max="1" master-node-max="1" clone-max="2" \
clone-node-max="1" notify="true" target-role="Started"
colocation apache2_with_ip inf: apache2 ip1
colocation apache2_with_mysql inf: apache2 ms_drbd_mysql:Master
colocation apache2_with_webfs inf: apache2 ms_drbd_webfs:Master
colocation fs_on_drbd inf: fs_mysql ms_drbd_mysql:Master
colocation ip_with_ip_arp inf: ip1 ip1arp
colocation mysql_on_drbd inf: MySQLDB ms_drbd_mysql:Master
colocation mysql_with_ip inf: MySQLDB ip1
colocation webfs_on_drbd inf: fs_webfs ms_drbd_webfs:Master
order apache2-after-arp inf: ip1arp:start apache2:start
order arp-after-ip inf: ip1:start ip1arp:start
order fs-mysql-after-drbd inf: ms_drbd_mysql:promote fs_mysql:start
order fs-webfs-after-drbd inf: ms_drbd_webfs:promote fs_webfs:start
order ip-after-mysql inf: mysql:start ip1:start
order mysql-after-fs-mysql inf: fs_mysql:start mysql:start
property $id="cib-bootstrap-options" \
dc-version="1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd" \
cluster-infrastructure="Heartbeat" \
expected-quorum-votes="1" \
stonith-enabled="false" \
no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
resource-stickiness="100"
Liang Ma
Contractuel | Consultant | SED Systems Inc.
Ground Systems Analyst
Agence spatiale canadienne | Canadian Space Agency
6767, Route de l'Aéroport, Longueuil (St-Hubert), QC, Canada, J3Y 8Y9
Tél/Tel : (450) 926-5099 | Téléc/Fax: (450) 926-5083
Courriel/E-mail : [liang.ma at space.gc.ca]
Site web/Web site : [www.space.gc.ca ]
-----Original Message-----
From: Ma, Liang
Sent: February 9, 2011 9:59 AM
To: 'The Pacemaker cluster resource manager'
Subject: Could not connect to the CIB: Remote node did not respond
Hi there,
After a network and power shutdown, my LAMP cluster servers were totally screwed up.
Now crm status gives me:
crm status
============
Last updated: Wed Feb 9 09:44:17 2011
Stack: Heartbeat
Current DC: arsvr2 (bc6bf61d-6b5f-4307-85f3-bf7bb11531bb) - partition with quorum
Version: 1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd
2 Nodes configured, 1 expected votes
4 Resources configured.
============
Online: [ arsvr1 arsvr2 ]
None of the resources come up.
First I found a split brain on the DRBD disks. I fixed that, and the DRBD disks are now healthy; I can mount them manually without a problem.
However, whenever I try to start a resource, edit the CIB, or even run a query, I get errors like the following:
crm resource start fs_mysql
Call cib_replace failed (-41): Remote node did not respond <null>
crm configure edit
Could not connect to the CIB: Remote node did not respond
ERROR: creating tmp shadow __crmshell.2540 failed
cibadmin -Q
Call cib_query failed (-41): Remote node did not respond <null>
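To make the symptom easier to reproduce on both nodes, the commands mentioned in this thread can be run in one pass. This is only a minimal sketch: run_check is a hypothetical helper name (not part of Pacemaker or the crm shell), and it only reports whether each command exits successfully; it does not interpret the output or fix anything.

```shell
#!/bin/sh
# Hypothetical helper: run one command quietly and report OK, FAIL, or SKIP.
# Usage: run_check LABEL COMMAND [ARGS...]
run_check() {
    label=$1
    shift
    # Skip cleanly when the tool is not installed on this node.
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "SKIP $label ($1 not installed)"
        return 0
    fi
    if "$@" >/dev/null 2>&1; then
        echo "OK   $label"
    else
        # In the else branch, $? still holds the exit status of the command.
        echo "FAIL $label (exit $?)"
    fi
}

# The three commands quoted in this thread.
run_check "crm status"         crm status
run_check "crm configure show" crm configure show
run_check "cibadmin -Q"        cibadmin -Q
```

Running it on arsvr1 and arsvr2 in turn would show whether the CIB query fails on both nodes or only on one.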
Any idea what I can do to bring the cluster back?
Thank you,
Liang Ma