[Pacemaker] mysql RA constantly restarting db

Thu Jul 22 19:45:45 UTC 2010

Hello,
I am new to Pacemaker and struggling with the somewhat limited
documentation. I looked through the archives and didn't find anything that
matched my problem. I have a brand new Pacemaker setup running on CentOS 5.5.
I am using the config below to start MySQL, which is also a brand new
build. Right now the cluster is running on only one node while I try to
isolate this problem. The CIB is brand new as well. The cluster
starts up, but then every 30 seconds or so I see it restart MySQL. If I stop
Heartbeat and bring up MySQL by itself, it starts up just fine. It's driving
me batty, so I thought I would post here and see if someone is able to
help. What I see in syslog from Heartbeat is:

Jul 22 15:34:09 sipl-mysql-109 lrmd: [11182]: info: rsc:d_mysql:69: start
Jul 22 15:34:11 sipl-mysql-109 lrmd: [11182]: info: RA output:
(ip_db:start:stderr) ARPING 10.200.131.9 from 10.200.131.9 eth0 Sent 5
probes (5 broadcast(s)) Received 0 response(s)
Jul 22 15:34:13 sipl-mysql-109 mysql[14915]: [15086]: INFO: MySQL started
Jul 22 15:34:13 sipl-mysql-109 crmd: [11185]: info: process_lrm_event: LRM
operation d_mysql_start_0 (call=69, rc=0, cib-update=105, confirmed=true) ok
Jul 22 15:34:13 sipl-mysql-109 crmd: [11185]: info: match_graph_event:
Action d_mysql_start_0 (6) confirmed on sipl-mysql-109 (rc=0)
Jul 22 15:34:13 sipl-mysql-109 crmd: [11185]: info: te_rsc_command:
Initiating action 1: monitor d_mysql_monitor_10000 on sipl-mysql-109 (local)
Jul 22 15:34:13 sipl-mysql-109 crmd: [11185]: info: do_lrm_rsc_op:
Performing key=1:14:0:989206b7-461a-42db-a2a7-7b447bd6c5b3
op=d_mysql_monitor_10000 )
Jul 22 15:34:13 sipl-mysql-109 lrmd: [11182]: info: rsc:d_mysql:70: monitor
Jul 22 15:34:13 sipl-mysql-109 crmd: [11185]: info: te_rsc_command:
Initiating action 8: start ip_db_start_0 on sipl-mysql-109 (local)
Jul 22 15:34:13 sipl-mysql-109 crmd: [11185]: info: do_lrm_rsc_op:
Performing key=8:14:0:989206b7-461a-42db-a2a7-7b447bd6c5b3 op=ip_db_start_0
)
Jul 22 15:34:13 sipl-mysql-109 lrmd: [11182]: info: rsc:ip_db:71: start
Jul 22 15:34:13 sipl-mysql-109 crmd: [11185]: info: process_lrm_event: LRM
operation d_mysql_monitor_10000 (call=70, rc=7, cib-update=106,
confirmed=false) not running
Jul 22 15:34:13 sipl-mysql-109 crmd: [11185]: WARN: status_from_rc: Action 1
(d_mysql_monitor_10000) on sipl-mysql-109 failed (target: 0 vs. rc: 7):
Error
Jul 22 15:34:13 sipl-mysql-109 crmd: [11185]: WARN: update_failcount:
Updating failcount for d_mysql on sipl-mysql-109 after failed monitor.
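
To make the sequence in that excerpt easier to see, here is a rough Python
sketch that pulls the operation name and return code out of the
process_lrm_event lines. The two sample lines are copied from the log above
and re-joined onto single lines; the regular expression and the helper names
(lrm_results, failed) are my own and not part of Pacemaker. It only
illustrates that the start reports rc=0 and the first monitor right after it
reports rc=7 ("not running").

```python
import re

# Two crmd lines copied from the syslog excerpt, re-joined onto single lines.
LOG = """\
Jul 22 15:34:13 sipl-mysql-109 crmd: [11185]: info: process_lrm_event: LRM operation d_mysql_start_0 (call=69, rc=0, cib-update=105, confirmed=true) ok
Jul 22 15:34:13 sipl-mysql-109 crmd: [11185]: info: process_lrm_event: LRM operation d_mysql_monitor_10000 (call=70, rc=7, cib-update=106, confirmed=false) not running
"""

# Hypothetical pattern for this log format: operation name, call id, return code.
LRM_EVENT = re.compile(
    r"LRM operation (?P<op>\S+) \(call=(?P<call>\d+), rc=(?P<rc>\d+),"
)


def lrm_results(text):
    """Return (operation, rc) pairs for every process_lrm_event line found."""
    return [(m.group("op"), int(m.group("rc"))) for m in LRM_EVENT.finditer(text)]


def failed(results):
    """Keep only the operations whose return code is not 0."""
    return [(op, rc) for op, rc in results if rc != 0]


if __name__ == "__main__":
    for op, rc in lrm_results(LOG):
        print(op, rc)
```

Run against a longer slice of syslog, failed(lrm_results(...)) would list
only the operations that came back with a non-zero rc.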

The output of crm configure show is:

primitive d_mysql ocf:heartbeat:mysql \
op start interval="0" timeout="120" \
op stop interval="0" timeout="120" \
op monitor interval="10" timeout="30" depth="0" param
binary="/usr/bin/mysqld_safe" config="/etc/my.cnf" datadir="/var/lib/mysql"
user="mysql" pid="/var/run/mysqld/mysql.pid"
socket="/var/lib/mysql/mysql.sock"
primitive ip_db ocf:heartbeat:IPaddr2 \
params ip="10.200.131.9" cidr_netmask="32" \
op monitor interval="30s" nic="eth0"
group sv_db d_mysql ip_db
property $id="cib-bootstrap-options" \
stonith-enabled="false" \
no-quorum-policy="ignore" \
start-failure-is-fatal="false" \
expected-quorum-votes="2" \
dc-version="1.0.9-89bd754939df5150de7cd76835f98fe90851b677" \
cluster-infrastructure="Heartbeat"
rsc_defaults $id="rsc_defaults-options" \
migration-threshold="20" \
failure-timeout="20"

My versions are as follows:

[root@sipl-mysql-109 rc0.d]# rpm -qa | egrep "coro|pacemaker|heart"
corosynclib-1.2.5-1.3.el5
corosync-1.2.5-1.3.el5
corosync-1.2.5-1.3.el5
heartbeat-3.0.3-2.3.el5
pacemaker-1.0.9.1-1.11.el5
pacemaker-1.0.9.1-1.11.el5
corosynclib-1.2.5-1.3.el5
heartbeat-libs-3.0.3-2.3.el5
heartbeat-3.0.3-2.3.el5
pacemaker-libs-1.0.9.1-1.11.el5
heartbeat-libs-3.0.3-2.3.el5
pacemaker-libs-1.0.9.1-1.11.el5

rpm -qa | grep resource
resource-agents-1.0.3-2.6.el5

[root@sipl-mysql-109 rc0.d]# cat /etc/redhat-release
CentOS release 5.5 (Final)

[root@sipl-mysql-109 rc0.d]# uname -r
2.6.18-194.8.1.el5

[root@sipl-mysql-109 rc0.d]# mysql -V
mysql  Ver 14.14 Distrib 5.1.48, for unknown-linux-gnu (x86_64) using
readline 5.1

My ha.cf looks like:

autojoin none
mcast eth0 227.0.0.10 694 1 0
warntime 5
deadtime 15
initdead 60
keepalive 5
auto_failback off
node sipl-mysql-109
node sipl-mysql-209
crm on

MySQL shows the following in its error log:

100722 15:33:57 [Note] Plugin 'FEDERATED' is disabled.
100722 15:33:57  InnoDB: Started; log sequence number 0 44233
100722 15:33:57 [Note] Event Scheduler: Loaded 0 events
100722 15:33:57 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.1.48-community-log'  socket: '/var/lib/mysql/mysql.sock'  port:
3306  MySQL Community Server (GPL)
100722 15:34:01 [Note] /usr/sbin/mysqld: Normal shutdown

100722 15:34:01 [Note] Event Scheduler: Purging the queue. 0 events
100722 15:34:01  InnoDB: Starting shutdown...
100722 15:34:02  InnoDB: Shutdown completed; log sequence number 0 44233
100722 15:34:02 [Note] /usr/sbin/mysqld: Shutdown complete

100722 15:34:02 mysqld_safe mysqld from pid file /var/run/mysql/mysqld.pid
ended
100722 15:34:03 mysqld_safe Starting mysqld daemon with databases from
/var/lib/mysql
100722 15:34:03 [Warning] '--skip-locking' is deprecated and will be removed
in a future release. Please use '--skip-external-locking' instead.
100722 15:34:03 [Note] Plugin 'FEDERATED' is disabled.
100722 15:34:03  InnoDB: Started; log sequence number 0 44233
100722 15:34:03 [Note] Event Scheduler: Loaded 0 events
100722 15:34:03 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.1.48-community-log'  socket: '/var/lib/mysql/mysql.sock'  port:
3306  MySQL Community Server (GPL)
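
One thing that may be worth checking, though I have not confirmed it is the
cause: the pid parameter in the d_mysql primitive and the pid file that
mysqld_safe reports in the error log do not look like the same path. Below
is a minimal Python sketch that lines the two up. Both sample strings are
copied from this message (the log line re-joined onto one line); the helper
names configured_pid and logged_pid are hypothetical.

```python
import re

# The pid parameter as it appears in the d_mysql primitive.
CONFIG_LINE = 'user="mysql" pid="/var/run/mysqld/mysql.pid"'

# The mysqld_safe line from the MySQL error log, re-joined onto one line.
LOG_LINE = (
    "100722 15:34:02 mysqld_safe mysqld from pid file "
    "/var/run/mysql/mysqld.pid ended"
)


def configured_pid(line):
    """Extract the value of pid="..." from a crm configuration line."""
    m = re.search(r'pid="([^"]+)"', line)
    return m.group(1) if m else None


def logged_pid(line):
    """Extract the path that follows 'from pid file' in a mysqld_safe line."""
    m = re.search(r"from pid file (\S+)", line)
    return m.group(1) if m else None


if __name__ == "__main__":
    cfg, log = configured_pid(CONFIG_LINE), logged_pid(LOG_LINE)
    print("configured:", cfg)
    print("logged:    ", log)
    print("same path: ", cfg == log)
```

Whether the resource agent's monitor actually depends on that pid file is an
assumption on my part.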

Any help would be greatly appreciated. Thanks in advance.
F.