[Pacemaker] Cluster Test Suite: Bad News

Schmidt, Torsten torsten.schmidt at tecdoc.net
Wed Oct 28 10:33:03 EDT 2009


Hi list,

After manually testing my new cluster (with positive results), I ran one round of the cluster test suite with the following command:

./CTSlab.py --at-boot 1 --nodes 'mysqlha1 mysqlha2' --stack ais --syslog-facility local6 --schema pacemaker-1.0 --logfile /var/log/messages --stonith 1 --standby 1 --fencing 1 --stonith-type ssh --once 1

There seems to be a problem with openais, but I'm not sure how significant these CTS results are.
After CTS finished, the crm on both nodes was not reachable (although openais was still running).

Any hints or comments are very much appreciated.


Here is my CTS BadNews output, filtered for one node only:
===============================================================
mysqlha1 attrd:  ERROR: ais_dispatch: Receiving message body failed: (-1) unknown: Resource temporarily unavailable (11)
mysqlha1 attrd:  ERROR: ais_dispatch: AIS connection failed
mysqlha1 attrd:  CRIT: attrd_ais_destroy: Lost connection to OpenAIS service!
mysqlha1 attrd:  ERROR: attrd_cib_connection_destroy: Connection to the CIB terminated...
mysqlha1 crmd:  ERROR: attrd_connection_destroy: Lost connection to attrd
mysqlha1 cib:  ERROR: ais_dispatch: Receiving message body failed: (-1) unknown: Resource temporarily unavailable (11)
mysqlha1 cib:  ERROR: ais_dispatch: AIS connection failed
mysqlha1 cib:  ERROR: cib_ais_destroy: AIS connection terminated
mysqlha1 crmd:  CRIT: cib_native_dispatch: Lost connection to the CIB service [1655/callback].
mysqlha1 crmd:  CRIT: cib_native_dispatch: Lost connection to the CIB service [1655/command].
mysqlha1 stonithd:  ERROR: ais_dispatch: Receiving message body failed: (-1) unknown: Resource temporarily unavailable (11)
mysqlha1 crmd:  ERROR: crmd_cib_connection_destroy: Connection to the CIB terminated...
mysqlha1 stonithd:  ERROR: ais_dispatch: AIS connection failed
mysqlha1 crmd:  ERROR: do_log: FSA: Input I_ERROR from crmd_cib_connection_destroy() received in state S_IDLE
mysqlha1 stonithd:  ERROR: AIS connection terminated
mysqlha1 crmd:  info: do_state_transition: State transition S_IDLE -> S_RECOVERY [ input=I_ERROR cause=C_FSA_INTERNAL origin=crmd_cib_connection_destroy ]
mysqlha1 crmd:  ERROR: do_recover: Action A_RECOVER (0000000001000000) not supported
mysqlha1 crmd:  ERROR: do_log: FSA: Input I_TERMINATE from do_recover() received in state S_RECOVERY
mysqlha1 crmd:  ERROR: verify_stopped: Resource res.stonith.ssh:1 was active at shutdown.  You may ignore this error if it is unmanaged.
mysqlha1 crmd:  ERROR: verify_stopped: Resource res.drbd.mysqldb:1 was active at shutdown.  You may ignore this error if it is unmanaged.
mysqlha1 crmd:  ERROR: do_exit: Could not recover from internal error
mysqlha1 drbd ERROR: mysqldb: Called drbdadm -c /etc/drbd.conf --peer mysqlha2 up mysqldb
mysqlha1 drbd ERROR: mysqldb: Exit code 1
mysqlha1 lrmd:  info: RA output: (res.drbd.mysqldb:1:start:stderr) 0: Failure: (124) Device is attached to a disk (use detach first) Command 'drbdsetup 0 disk /dev/sdb /dev/sdb internal --set-defaults --create-device --on-io-error=detach' terminated with exit code 10 2009/10/27_16:57:43 ERROR: mysqldb: Called drbdadm -c /etc/drbd.conf --peer mysqlha2 up mysqldb
mysqlha1 drbd ERROR: mysqldb: Command output:
mysqlha1 lrmd:  info: RA output: (res.drbd.mysqldb:1:start:stderr) 2009/10/27_16:57:43 ERROR: mysqldb: Exit code 1
mysqlha1 lrmd:  info: RA output: (res.drbd.mysqldb:1:start:stderr) 2009/10/27_16:57:43 ERROR: mysqldb: Command output:
mysqlha1 IPaddr2 ERROR: Could not send gratuitous arps
mysqlha1 lrmd:  info: RA output: (res.ip.stacked:start:stderr) 2009/10/27_16:57:50 ERROR: Could not send gratuitous arps
mysqlha1 crmd:  info: te_fence_node: Executing reboot fencing operation (83) on mysqlha2 (timeout=60000)
mysqlha1 drbd ERROR: mysqlstack: Called drbdadm -c /etc/drbd.conf -S up mysqlstack
mysqlha1 drbd ERROR: mysqlstack: Exit code 1
mysqlha1 lrmd:  info: RA output: (res.drbd.mysqlstack:0:start:stderr) 2009/10/27_17:02:00 ERROR: mysqlstack: Called drbdadm -c /etc/drbd.conf -S up mysqlstack
mysqlha1 drbd ERROR: mysqlstack: Command output:
mysqlha1 lrmd:  info: RA output: (res.drbd.mysqlstack:0:start:stderr) 2009/10/27_17:02:00 ERROR: mysqlstack: Exit code 1
mysqlha1 lrmd:  info: RA output: (res.drbd.mysqlstack:0:start:stderr) 2009/10/27_17:02:00 ERROR: mysqlstack: Command output:
mysqlha1 lrmd:  info: RA output: (res.ip.stacked:start:stderr) 2009/10/27_17:02:04 ERROR: Could not send gratuitous arps
mysqlha1 lrmd:  info: RA output: (res.ip.mysql:start:stderr) 2009/10/27_17:02:08 ERROR: Could not send gratuitous arps
mysqlha1 openais [crm  ] ERROR: pcmk_wait_dispatch: Child process mgmtd terminated with signal 15 (pid=1684, core=false)
mysqlha1 crmd:  CRIT: cib_native_dispatch: Lost connection to the CIB service [1679/callback].
mysqlha1 crmd:  CRIT: cib_native_dispatch: Lost connection to the CIB service [1679/command].
mysqlha1 crmd:  ERROR: verify_stopped: Resource res.ip.stacked was active at shutdown.  You may ignore this error if it is unmanaged.
mysqlha1 crmd:  ERROR: verify_stopped: Resource res.ocf.mysql was active at shutdown.  You may ignore this error if it is unmanaged.
mysqlha1 crmd:  ERROR: verify_stopped: Resource res.drbd.mysqlstack:0 was active at shutdown.  You may ignore this error if it is unmanaged.
mysqlha1 crmd:  ERROR: verify_stopped: Resource res.fs.mysql was active at shutdown.  You may ignore this error if it is unmanaged.
mysqlha1 crmd:  ERROR: verify_stopped: Resource res.ip.mysql was active at shutdown.  You may ignore this error if it is unmanaged.
mysqlha1 lrmd:  info: RA output: (res.ip.mysql:start:stderr) 2009/10/27_17:30:21 ERROR: Could not send gratuitous arps
mysqlha1 crmd:  ERROR: ais_dispatch: Receiving message body failed: (-1) unknown: Resource temporarily unavailable (11)
mysqlha1 crmd:  ERROR: ais_dispatch: AIS connection failed
mysqlha1 crmd:  ERROR: crm_ais_destroy: AIS connection terminated
mysqlha1 openais [crm  ] ERROR: pcmk_wait_dispatch: Child process mgmtd terminated with signal 15 (pid=29800, core=false)
mysqlha1 crmd:  ERROR: stonithd_op_result_ready: not signed on
mysqlha1 crmd:  CRIT: tengine_stonith_connection_destroy: Fencing daemon connection failed

CTS: Overall Results:{'auditfail': 2, 'failure': 8, 'skipped': 0, 'success': 7, 'BadNews': 593}
===============================================================
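
With 593 BadNews entries it may help to see which daemons produce most of them. The following is a minimal sketch, not part of CTS itself: the function name `tally` and the regular expression are my own assumptions about the line layout shown above (node, then daemon or resource agent, then a severity keyword), and the sample holds only a few lines copied from the output.

```python
import re
from collections import Counter

# A few lines copied verbatim from the BadNews output above.
SAMPLE = """\
mysqlha1 attrd:  ERROR: ais_dispatch: Receiving message body failed: (-1) unknown: Resource temporarily unavailable (11)
mysqlha1 attrd:  ERROR: ais_dispatch: AIS connection failed
mysqlha1 attrd:  CRIT: attrd_ais_destroy: Lost connection to OpenAIS service!
mysqlha1 cib:  ERROR: ais_dispatch: AIS connection failed
mysqlha1 crmd:  ERROR: do_exit: Could not recover from internal error
mysqlha1 drbd ERROR: mysqldb: Exit code 1
mysqlha1 openais [crm  ] ERROR: pcmk_wait_dispatch: Child process mgmtd terminated with signal 15 (pid=1684, core=false)
"""

# Assumed layout: <node> <source>[:] [optional "[tag]"] <severity>: <message>
LINE_RE = re.compile(
    r"^(\S+)\s+([^\s:]+):?(?:\s+\[[^\]]*\])?\s+(ERROR|CRIT|WARN|info):"
)

def tally(text):
    """Count log lines per (source, severity); lines that do not match are skipped."""
    counts = Counter()
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            _node, source, severity = match.groups()
            counts[(source, severity)] += 1
    return counts

if __name__ == "__main__":
    for (source, severity), n in tally(SAMPLE).most_common():
        print(f"{source:10s} {severity:6s} {n}")
```

Run against the full log, a tally like this would show whether the AIS connection errors from attrd, cib, stonithd and crmd dominate, or whether the resource agent errors (drbd, IPaddr2) are the larger share.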


environment + versions: 
===============================================================
OS: RHEL 5.4 x86_64 
drbd 8.3.4 compiled from source 
openais.x86_64 0.80.6-8.el5_4.1 
heartbeat.x86_64 3.0.0-33.2 
resource-agents.x86_64 1.0-31.4 
pacemaker.x86_64 1.0.5-4.1 
pacemaker-libs.x86_64 1.0.5-4.1 
===============================================================


/etc/ais/openais.conf:
===============================================================
totem {
        version: 2
        secauth: on
        threads: 0
        rrp_mode: active
        clear_node_high_bit: yes
        vsftype: none
        token: 3000
        join: 60
        consensus: 1500
        max_messages: 20
        interface {
                ringnumber: 0
                bindnetaddr: 172.30.0.0
                mcastaddr: 226.94.1.1
                mcastport: 5405
        }
        interface {
                ringnumber: 1
                bindnetaddr: 10.6.0.0
                mcastaddr: 226.94.1.2
                mcastport: 5405
        }
}
logging {
        debug: off
        timestamp: on
        to_stderr: yes
        to_syslog: yes
        syslog_facility: local7
        to_file: no
}
amf {
        mode: disabled
}
service {
  ver:       0
  name:      pacemaker
  use_mgmtd: yes
}
aisexec {
  user:   root
  group:  root
}
===============================================================
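
One thing that could be checked in the totem section is the relation between `token` and `consensus`. The corosync.conf(5) man page of later corosync releases states that `consensus` must be at least 1.2 * `token`; whether openais 0.80.6 enforces or needs the same rule is an assumption on my part. The sketch below uses a hypothetical helper `parse_blocks` (not an openais tool) to read the relevant part of the file and compare the two values.

```python
# Excerpt of the totem section from the openais.conf above.
CONF = """\
totem {
        version: 2
        token: 3000
        join: 60
        consensus: 1500
        interface {
                ringnumber: 0
                bindnetaddr: 172.30.0.0
        }
        interface {
                ringnumber: 1
                bindnetaddr: 10.6.0.0
        }
}
"""

def parse_blocks(text):
    """Parse 'name {' ... '}' blocks containing 'key: value' lines.

    Each block name maps to a list of dicts, so repeated blocks
    (such as 'interface') are all kept.
    """
    root = {}
    stack = [root]
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.endswith("{"):
            child = {}
            stack[-1].setdefault(line[:-1].strip(), []).append(child)
            stack.append(child)
        elif line == "}":
            stack.pop()
        else:
            key, _, value = line.partition(":")
            stack[-1][key.strip()] = value.strip()
    return root

def consensus_too_low(totem):
    """True if consensus < 1.2 * token (rule taken from later corosync docs)."""
    token = int(totem["token"])
    consensus = int(totem["consensus"])
    # Integer form of consensus < 1.2 * token, avoiding float rounding.
    return consensus * 5 < token * 6

if __name__ == "__main__":
    totem = parse_blocks(CONF)["totem"][0]
    print("rings configured:", len(totem["interface"]))
    print("consensus below 1.2 * token:", consensus_too_low(totem))
```

With `token: 3000` and `consensus: 1500` the check flags the configuration, so this pair of values may be worth a second look if the rule applies to this openais version.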


crm configure show:
===============================================================
node mysqlha1 \
	attributes standby="off"
node mysqlha2 \
	attributes standby="off"
primitive res.drbd.mysqldb ocf:linbit:drbd \
	params drbd_resource="mysqldb" \
	op monitor interval="59s" role="Master" timeout="30s" \
	op monitor interval="60s" role="Slave" timeout="30s" \
	meta target-role="Started" is-managed="true"
primitive res.drbd.mysqlstack ocf:linbit:drbd \
	params drbd_resource="mysqlstack" \
	op monitor interval="58s" role="Master" timeout="29s" \
	op monitor interval="59s" role="Slave" timeout="28s" \
	meta is-managed="true"
primitive res.fs.mysql ocf:heartbeat:Filesystem \
	params device="/dev/drbd10" directory="/opt/data" fstype="ext3" \
	op monitor interval="20" timeout="10s" \
	op start interval="0" timeout="5s"
primitive res.ip.mysql ocf:heartbeat:IPaddr2 \
	params ip="172.30.2.10" nic="eth0" cidr_netmask="22" \
	op monitor interval="2s" timeout="1s"
primitive res.ip.stacked ocf:heartbeat:IPaddr2 \
	params ip="10.6.0.132" nic="eth1" cidr_netmask="24" \
	op monitor interval="3s" timeout="2s" \
	meta is-managed="true"
primitive res.ocf.mysql ocf:heartbeat:mysql \
	params binary="/usr/bin/mysqld_safe" config="/etc/my.cnf" user="mysql" group="mysql" log="/var/log/mysqld.log" pid="/var/run/mysqld/mysqld.pid" datadir="/opt/data" \
	op monitor interval="30s" timeout="10s"
primitive res.stonith.ssh stonith:ssh \
	params hostlist="mysqlHA1 mysqlHA2"
group group.mysql res.fs.mysql res.ip.mysql res.ocf.mysql \
	meta is-managed="true"
ms ms.drbd.mysqldb res.drbd.mysqldb \
	meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" globally_unique="false"
ms ms.drbd.mysqlstack res.drbd.mysqlstack \
	meta master-max="1" master-node-max="1" clone-max="1" clone-node-max="1" notify="true" globally_unique="false"
clone clone.res.stonith.ssh res.stonith.ssh \
	params globally-unique="false"
colocation co.ms.drbd.mysqlstack_on_ms.drbd.mysqldb inf: ms.drbd.mysqlstack ms.drbd.mysqldb:Master
colocation co.ms.drbd.mysqlstack_on_res.ip.stacked inf: ms.drbd.mysqlstack res.ip.stacked
colocation co.res.fs.mysql_on_ms.drbd.mysqlstack_master inf: group.mysql ms.drbd.mysqlstack:Master
colocation co.res.ip.stacked_on_ms.drbd.mysqldb_master inf: res.ip.stacked ms.drbd.mysqldb:Master
order o.ms.drbd.mysqldb_before_ms.drbd.mysqlstack inf: ms.drbd.mysqldb:promote ms.drbd.mysqlstack:start
order o.ms.drbd.mysqlstack_before_res.fs.mysql inf: ms.drbd.mysqlstack:promote res.fs.mysql:start
property $id="cib-bootstrap-options" \
	stonith-enabled="true" \
	expected-quorum-votes="2" \
	no-quorum-policy="ignore" \
	last-lrm-refresh="1256717460" \
	dc-version="1.0.5-462f1569a43740667daf7b0f6b521742e9eb8fa7" \
	cluster-infrastructure="openais"
===============================================================
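
A detail visible in the configuration above is that the `hostlist` of `res.stonith.ssh` spells the hosts as `mysqlHA1 mysqlHA2`, while the node entries use `mysqlha1` and `mysqlha2`. I do not know whether the ssh stonith plugin compares host names case-sensitively, so treat that as an assumption. A small sketch to spot such differences (the function name `case_mismatches` is hypothetical):

```python
import re

# Relevant lines copied from the 'crm configure show' output above.
CRM = r'''
node mysqlha1 \
    attributes standby="off"
node mysqlha2 \
    attributes standby="off"
primitive res.stonith.ssh stonith:ssh \
    params hostlist="mysqlHA1 mysqlHA2"
'''

NODE_RE = re.compile(r"^node\s+(\S+)", re.MULTILINE)
HOSTLIST_RE = re.compile(r'hostlist="([^"]*)"')

def case_mismatches(config):
    """Return (hostlist entry, node name) pairs that differ only in letter case."""
    nodes = set(NODE_RE.findall(config))
    by_lower = {name.lower(): name for name in nodes}
    mismatches = []
    for hostlist in HOSTLIST_RE.findall(config):
        for host in hostlist.split():
            if host not in nodes and host.lower() in by_lower:
                mismatches.append((host, by_lower[host.lower()]))
    return mismatches

if __name__ == "__main__":
    for host, node in case_mismatches(CRM):
        print(f"hostlist entry {host!r} differs in case from node {node!r}")
```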


With kind regards

Torsten Schmidt
