[Pacemaker] Cluster Test Suite: Bad News
Schmidt, Torsten
torsten.schmidt at tecdoc.net
Wed Oct 28 14:33:03 UTC 2009
Hi list,
After manually testing my new cluster (all tests passed), I ran one round of the Cluster Test Suite (CTS) with the following command:
./CTSlab.py --at-boot 1 --nodes 'mysqlha1 mysqlha2' --stack ais --syslog-facility local6 --schema pacemaker-1.0 --logfile /var/log/messages --stonith 1 --standby 1 --fencing 1 --stonith-type ssh --once 1
It seems there is a problem with OpenAIS, but I'm not sure how significant these CTS results are.
After CTS finished, the CRM was unreachable on both nodes (although OpenAIS was still running).
Any hints or comments are very much appreciated.
Here is my CTS BadNews output, filtered for one node only:
===============================================================
mysqlha1 attrd: ERROR: ais_dispatch: Receiving message body failed: (-1) unknown: Resource temporarily unavailable (11)
mysqlha1 attrd: ERROR: ais_dispatch: AIS connection failed
mysqlha1 attrd: CRIT: attrd_ais_destroy: Lost connection to OpenAIS service!
mysqlha1 attrd: ERROR: attrd_cib_connection_destroy: Connection to the CIB terminated...
mysqlha1 crmd: ERROR: attrd_connection_destroy: Lost connection to attrd
mysqlha1 cib: ERROR: ais_dispatch: Receiving message body failed: (-1) unknown: Resource temporarily unavailable (11)
mysqlha1 cib: ERROR: ais_dispatch: AIS connection failed
mysqlha1 cib: ERROR: cib_ais_destroy: AIS connection terminated
mysqlha1 crmd: CRIT: cib_native_dispatch: Lost connection to the CIB service [1655/callback].
mysqlha1 crmd: CRIT: cib_native_dispatch: Lost connection to the CIB service [1655/command].
mysqlha1 stonithd: ERROR: ais_dispatch: Receiving message body failed: (-1) unknown: Resource temporarily unavailable (11)
mysqlha1 crmd: ERROR: crmd_cib_connection_destroy: Connection to the CIB terminated...
mysqlha1 stonithd: ERROR: ais_dispatch: AIS connection failed
mysqlha1 crmd: ERROR: do_log: FSA: Input I_ERROR from crmd_cib_connection_destroy() received in state S_IDLE
mysqlha1 stonithd: ERROR: AIS connection terminated
mysqlha1 crmd: info: do_state_transition: State transition S_IDLE -> S_RECOVERY [ input=I_ERROR cause=C_FSA_INTERNAL origin=crmd_cib_connection_destroy ]
mysqlha1 crmd: ERROR: do_recover: Action A_RECOVER (0000000001000000) not supported
mysqlha1 crmd: ERROR: do_log: FSA: Input I_TERMINATE from do_recover() received in state S_RECOVERY
mysqlha1 crmd: ERROR: verify_stopped: Resource res.stonith.ssh:1 was active at shutdown. You may ignore this error if it is unmanaged.
mysqlha1 crmd: ERROR: verify_stopped: Resource res.drbd.mysqldb:1 was active at shutdown. You may ignore this error if it is unmanaged.
mysqlha1 crmd: ERROR: do_exit: Could not recover from internal error
mysqlha1 drbd ERROR: mysqldb: Called drbdadm -c /etc/drbd.conf --peer mysqlha2 up mysqldb
mysqlha1 drbd ERROR: mysqldb: Exit code 1
mysqlha1 lrmd: info: RA output: (res.drbd.mysqldb:1:start:stderr) 0: Failure: (124) Device is attached to a disk (use detach first) Command 'drbdsetup 0 disk /dev/sdb /dev/sdb internal --set-defaults --create-device --on-io-error=detach' terminated with exit code 10 2009/10/27_16:57:43 ERROR: mysqldb: Called drbdadm -c /etc/drbd.conf --peer mysqlha2 up mysqldb
mysqlha1 drbd ERROR: mysqldb: Command output:
mysqlha1 lrmd: info: RA output: (res.drbd.mysqldb:1:start:stderr) 2009/10/27_16:57:43 ERROR: mysqldb: Exit code 1
mysqlha1 lrmd: info: RA output: (res.drbd.mysqldb:1:start:stderr) 2009/10/27_16:57:43 ERROR: mysqldb: Command output:
mysqlha1 IPaddr2 ERROR: Could not send gratuitous arps
mysqlha1 lrmd: info: RA output: (res.ip.stacked:start:stderr) 2009/10/27_16:57:50 ERROR: Could not send gratuitous arps
mysqlha1 crmd: info: te_fence_node: Executing reboot fencing operation (83) on mysqlha2 (timeout=60000)
mysqlha1 drbd ERROR: mysqlstack: Called drbdadm -c /etc/drbd.conf -S up mysqlstack
mysqlha1 drbd ERROR: mysqlstack: Exit code 1
mysqlha1 lrmd: info: RA output: (res.drbd.mysqlstack:0:start:stderr) 2009/10/27_17:02:00 ERROR: mysqlstack: Called drbdadm -c /etc/drbd.conf -S up mysqlstack
mysqlha1 drbd ERROR: mysqlstack: Command output:
mysqlha1 lrmd: info: RA output: (res.drbd.mysqlstack:0:start:stderr) 2009/10/27_17:02:00 ERROR: mysqlstack: Exit code 1
mysqlha1 lrmd: info: RA output: (res.drbd.mysqlstack:0:start:stderr) 2009/10/27_17:02:00 ERROR: mysqlstack: Command output:
mysqlha1 lrmd: info: RA output: (res.ip.stacked:start:stderr) 2009/10/27_17:02:04 ERROR: Could not send gratuitous arps
mysqlha1 lrmd: info: RA output: (res.ip.mysql:start:stderr) 2009/10/27_17:02:08 ERROR: Could not send gratuitous arps
mysqlha1 openais [crm ] ERROR: pcmk_wait_dispatch: Child process mgmtd terminated with signal 15 (pid=1684, core=false)
mysqlha1 crmd: CRIT: cib_native_dispatch: Lost connection to the CIB service [1679/callback].
mysqlha1 crmd: CRIT: cib_native_dispatch: Lost connection to the CIB service [1679/command].
mysqlha1 crmd: ERROR: verify_stopped: Resource res.ip.stacked was active at shutdown. You may ignore this error if it is unmanaged.
mysqlha1 crmd: ERROR: verify_stopped: Resource res.ocf.mysql was active at shutdown. You may ignore this error if it is unmanaged.
mysqlha1 crmd: ERROR: verify_stopped: Resource res.drbd.mysqlstack:0 was active at shutdown. You may ignore this error if it is unmanaged.
mysqlha1 crmd: ERROR: verify_stopped: Resource res.fs.mysql was active at shutdown. You may ignore this error if it is unmanaged.
mysqlha1 crmd: ERROR: verify_stopped: Resource res.ip.mysql was active at shutdown. You may ignore this error if it is unmanaged.
mysqlha1 lrmd: info: RA output: (res.ip.mysql:start:stderr) 2009/10/27_17:30:21 ERROR: Could not send gratuitous arps
mysqlha1 crmd: ERROR: ais_dispatch: Receiving message body failed: (-1) unknown: Resource temporarily unavailable (11)
mysqlha1 crmd: ERROR: ais_dispatch: AIS connection failed
mysqlha1 crmd: ERROR: crm_ais_destroy: AIS connection terminated
mysqlha1 openais [crm ] ERROR: pcmk_wait_dispatch: Child process mgmtd terminated with signal 15 (pid=29800, core=false)
mysqlha1 crmd: ERROR: stonithd_op_result_ready: not signed on
mysqlha1 crmd: CRIT: tengine_stonith_connection_destroy: Fencing daemon connection failed
CTS: Overall Results:{'auditfail': 2, 'failure': 8, 'skipped': 0, 'success': 7, 'BadNews': 593}
===============================================================
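With 593 BadNews entries, a per-daemon tally can make it easier to see which component complains first and most often. The following is a minimal sketch, not part of CTS: the function name `tally_badnews` and the regular expression are assumptions based only on the line shapes visible in the excerpt above (`node daemon: SEVERITY: ...`, `node daemon SEVERITY: ...`, and `node openais [crm ] SEVERITY: ...`).

```python
import re
from collections import Counter

# Assumed line shapes, taken from the excerpt above:
#   "<node> <daemon>: <SEVERITY>: ..."
#   "<node> <daemon> <SEVERITY>: ..."            (drbd, IPaddr2)
#   "<node> openais [crm ] <SEVERITY>: ..."
LINE_RE = re.compile(
    r"^(?P<node>\S+)\s+(?P<daemon>[\w.-]+):?\s+"
    r"(?:\[[^\]]*\]\s+)?"
    r"(?P<severity>ERROR|CRIT|WARN|info|notice|debug):"
)


def tally_badnews(lines):
    """Count log lines per (daemon, severity); unmatched lines are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[(match.group("daemon"), match.group("severity"))] += 1
    return counts


# A few lines copied from the BadNews output above.
SAMPLE = [
    "mysqlha1 attrd: ERROR: ais_dispatch: AIS connection failed",
    "mysqlha1 crmd: CRIT: cib_native_dispatch: Lost connection to the CIB service [1655/callback].",
    "mysqlha1 crmd: info: te_fence_node: Executing reboot fencing operation (83) on mysqlha2 (timeout=60000)",
    "mysqlha1 drbd ERROR: mysqldb: Exit code 1",
    "mysqlha1 IPaddr2 ERROR: Could not send gratuitous arps",
    "mysqlha1 openais [crm ] ERROR: pcmk_wait_dispatch: Child process mgmtd terminated with signal 15 (pid=1684, core=false)",
]

for (daemon, severity), count in sorted(tally_badnews(SAMPLE).items()):
    print(f"{daemon:10s} {severity:6s} {count}")
```

To run it over a full BadNews dump, pass the open file object (or `f.readlines()`) to `tally_badnews` instead of `SAMPLE`.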
Environment and versions:
===============================================================
OS: RHEL 5.4 x86_64
drbd 8.3.4 compiled from source
openais.x86_64 0.80.6-8.el5_4.1
heartbeat.x86_64 3.0.0-33.2
resource-agents.x86_64 1.0-31.4
pacemaker.x86_64 1.0.5-4.1
pacemaker-libs.x86_64 1.0.5-4.1
===============================================================
/etc/ais/openais.conf:
===============================================================
totem {
version: 2
secauth: on
threads: 0
rrp_mode: active
clear_node_high_bit: yes
vsftype: none
token: 3000
join: 60
consensus: 1500
max_messages: 20
interface {
ringnumber: 0
bindnetaddr: 172.30.0.0
mcastaddr: 226.94.1.1
mcastport: 5405
}
interface {
ringnumber: 1
bindnetaddr: 10.6.0.0
mcastaddr: 226.94.1.2
mcastport: 5405
}
}
logging {
debug: off
timestamp: on
to_stderr: yes
to_syslog: yes
syslog_facility: local7
to_file: no
}
amf {
mode: disabled
}
service {
ver: 0
name: pacemaker
use_mgmtd: yes
}
aisexec {
user: root
group: root
}
===============================================================
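One detail that can be read directly from the post: CTS was started with `--syslog-facility local6`, while the `logging` section above sets `syslog_facility: local7`. Whether this difference has any bearing on the results is not established here. The sketch below only surfaces it by parsing an excerpt of the configuration and the CTS command line. The helper names (`parse_openais_conf`, `lookup`, `cts_option`) are hypothetical, and the parser handles only the simple `name {` / `key: value` / `}` layout shown above.

```python
import shlex


def parse_openais_conf(text):
    """Return a list of (section_path, key, value) tuples.

    Assumes one item per line: 'name {' opens a section, '}' closes it,
    and 'key: value' sets a directive. Lines starting with '#' are skipped.
    """
    entries = []
    stack = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.endswith("{"):
            stack.append(line[:-1].strip())
        elif line == "}":
            stack.pop()
        elif ":" in line:
            key, value = line.split(":", 1)
            entries.append((tuple(stack), key.strip(), value.strip()))
    return entries


def lookup(entries, path, key):
    """Collect every value of `key` found in sections matching `path`."""
    return [v for (p, k, v) in entries if p == tuple(path) and k == key]


def cts_option(command, flag):
    """Return the argument following `flag` in a shell command, or None."""
    args = shlex.split(command)
    if flag in args:
        idx = args.index(flag)
        if idx + 1 < len(args):
            return args[idx + 1]
    return None


# Excerpt of the configuration and the command line from this post.
CONF_EXCERPT = """
totem {
token: 3000
consensus: 1500
interface {
ringnumber: 0
bindnetaddr: 172.30.0.0
}
interface {
ringnumber: 1
bindnetaddr: 10.6.0.0
}
}
logging {
to_syslog: yes
syslog_facility: local7
}
"""
CTS_COMMAND = (
    "./CTSlab.py --at-boot 1 --nodes 'mysqlha1 mysqlha2' --stack ais "
    "--syslog-facility local6 --schema pacemaker-1.0 --logfile /var/log/messages"
)

entries = parse_openais_conf(CONF_EXCERPT)
conf_facility = lookup(entries, ("logging",), "syslog_facility")
cts_facility = cts_option(CTS_COMMAND, "--syslog-facility")
print("openais.conf syslog_facility:", conf_facility)
print("CTS --syslog-facility:       ", cts_facility)
if cts_facility not in conf_facility:
    print("The two facilities differ.")
```

Because `interface` appears twice, `lookup` returns a list rather than a single value, so both rings can be inspected with `lookup(entries, ("totem", "interface"), "bindnetaddr")`.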
crm configure show:
===============================================================
node mysqlha1 \
attributes standby="off"
node mysqlha2 \
attributes standby="off"
primitive res.drbd.mysqldb ocf:linbit:drbd \
params drbd_resource="mysqldb" \
op monitor interval="59s" role="Master" timeout="30s" \
op monitor interval="60s" role="Slave" timeout="30s" \
meta target-role="Started" is-managed="true"
primitive res.drbd.mysqlstack ocf:linbit:drbd \
params drbd_resource="mysqlstack" \
op monitor interval="58s" role="Master" timeout="29s" \
op monitor interval="59s" role="Slave" timeout="28s" \
meta is-managed="true"
primitive res.fs.mysql ocf:heartbeat:Filesystem \
params device="/dev/drbd10" directory="/opt/data" fstype="ext3" \
op monitor interval="20" timeout="10s" \
op start interval="0" timeout="5s"
primitive res.ip.mysql ocf:heartbeat:IPaddr2 \
params ip="172.30.2.10" nic="eth0" cidr_netmask="22" \
op monitor interval="2s" timeout="1s"
primitive res.ip.stacked ocf:heartbeat:IPaddr2 \
params ip="10.6.0.132" nic="eth1" cidr_netmask="24" \
op monitor interval="3s" timeout="2s" \
meta is-managed="true"
primitive res.ocf.mysql ocf:heartbeat:mysql \
params binary="/usr/bin/mysqld_safe" config="/etc/my.cnf" user="mysql" group="mysql" log="/var/log/mysqld.log" pid="/var/run/mysqld/mysqld.pid" datadir="/opt/data" \
op monitor interval="30s" timeout="10s"
primitive res.stonith.ssh stonith:ssh \
params hostlist="mysqlHA1 mysqlHA2"
group group.mysql res.fs.mysql res.ip.mysql res.ocf.mysql \
meta is-managed="true"
ms ms.drbd.mysqldb res.drbd.mysqldb \
meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" globally_unique="false"
ms ms.drbd.mysqlstack res.drbd.mysqlstack \
meta master-max="1" master-node-max="1" clone-max="1" clone-node-max="1" notify="true" globally_unique="false"
clone clone.res.stonith.ssh res.stonith.ssh \
params globally-unique="false"
colocation co.ms.drbd.mysqlstack_on_ms.drbd.mysqldb inf: ms.drbd.mysqlstack ms.drbd.mysqldb:Master
colocation co.ms.drbd.mysqlstack_on_res.ip.stacked inf: ms.drbd.mysqlstack res.ip.stacked
colocation co.res.fs.mysql_on_ms.drbd.mysqlstack_master inf: group.mysql ms.drbd.mysqlstack:Master
colocation co.res.ip.stacked_on_ms.drbd.mysqldb_master inf: res.ip.stacked ms.drbd.mysqldb:Master
order o.ms.drbd.mysqldb_before_ms.drbd.mysqlstack inf: ms.drbd.mysqldb:promote ms.drbd.mysqlstack:start
order o.ms.drbd.mysqlstack_before_res.fs.mysql inf: ms.drbd.mysqlstack:promote res.fs.mysql:start
property $id="cib-bootstrap-options" \
stonith-enabled="true" \
expected-quorum-votes="2" \
no-quorum-policy="ignore" \
last-lrm-refresh="1256717460" \
dc-version="1.0.5-462f1569a43740667daf7b0f6b521742e9eb8fa7" \
cluster-infrastructure="openais"
===============================================================
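For reviewing the configuration above, it can help to list every `op` definition per primitive in one place, since the intervals and timeouts are spread across many continuation lines. The following is a minimal sketch that works on the text printed by `crm configure show`. The helper names are hypothetical, and the parsing relies only on the layout visible in this post (backslash line continuations and `key="value"` attributes). It makes no judgment about which values are appropriate.

```python
import re

OP_RE = re.compile(r'\bop\s+(\w+)((?:\s+[\w-]+="[^"]*")+)')
ATTR_RE = re.compile(r'([\w-]+)="([^"]*)"')


def join_continuations(text):
    """Merge lines ending in a backslash into single statements."""
    statements = []
    current = ""
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.endswith("\\"):
            current += stripped[:-1].strip() + " "
        else:
            current += stripped
            if current.strip():
                statements.append(current.strip())
            current = ""
    if current.strip():
        statements.append(current.strip())
    return statements


def primitive_ops(text):
    """Map each primitive name to a list of (action, attributes) pairs."""
    result = {}
    for statement in join_continuations(text):
        parts = statement.split()
        if len(parts) >= 3 and parts[0] == "primitive":
            ops = [
                (action, dict(ATTR_RE.findall(attrs)))
                for action, attrs in OP_RE.findall(statement)
            ]
            result[parts[1]] = ops
    return result


# Two primitives copied from the configuration above.
SAMPLE = r'''primitive res.fs.mysql ocf:heartbeat:Filesystem \
params device="/dev/drbd10" directory="/opt/data" fstype="ext3" \
op monitor interval="20" timeout="10s" \
op start interval="0" timeout="5s"
primitive res.ip.mysql ocf:heartbeat:IPaddr2 \
params ip="172.30.2.10" nic="eth0" cidr_netmask="22" \
op monitor interval="2s" timeout="1s"
'''

for name, ops in primitive_ops(SAMPLE).items():
    for action, attrs in ops:
        print(name, action, attrs.get("interval"), attrs.get("timeout"))
```

The same function can be fed the complete `crm configure show` output saved to a file; statements other than `primitive` are ignored.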
With kind regards
Torsten Schmidt