[Pacemaker] Resource group restarts every 15 minutes

Charles Ulrich charles at bityard.net
Sun Nov 20 18:06:04 EST 2011


Hello,

First off, I'm brand new to pacemaker and all of its tools. I'm trying
to come up to speed as quickly as I can, but understand that my
knowledge is probably lacking in some key areas. As Murphy would have
it, I've come across a problem that Google has not been able to help
me with.

Here's the setup: two machines, eldon and elisa, with heartbeat and
drbd configured. eldon is running a resource group called "www", which
contains apache, an IP address, and /var/www mounted from a drbd
device. (There's a "mysql" resource group on elisa, but that appears
to be functioning normally for now.)

Here's the problem: The www resource group on eldon keeps getting
restarted every 16 minutes. (Up for 15, down for 1.) Based on the logs
on elisa, I believe this is happening whenever the
cluster-recheck-interval is hit, which defaults to 15 minutes. I
believe that Pacemaker thinks the configuration (or something) in the
resource group changed and initiates a restart at every recheck
interval. These are the log messages from elisa that led me down this
line of reasoning:

Nov 19 13:44:02 elisa pengine: [1460]: notice: check_rsc_parameters: Forcing restart of www on eldon, type changed: Filesystem -> <null>
Nov 19 13:44:02 elisa pengine: [1460]: notice: check_rsc_parameters: Forcing restart of www on eldon, class changed: ocf -> <null>
Nov 19 13:44:02 elisa pengine: [1460]: notice: check_rsc_parameters: Forcing restart of www on eldon, provider changed: heartbeat -> <null>
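
To pull these notices out of a larger syslog and see which resource
attributes the policy engine reports as changed at each restart, a
minimal Python sketch follows. The regular expression and the function
name summarize_forced_restarts are assumptions for illustration, modeled
only on the three messages above; other Pacemaker versions may word
these notices differently. It expects each log entry on a single line.

```python
import re
from collections import defaultdict

# Pattern modeled on the three pengine notices quoted above (an
# assumption; it is not taken from any Pacemaker documentation).
FORCED_RESTART = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+pengine:.*?"
    r"check_rsc_parameters:\s+Forcing restart of (?P<rsc>\S+) on (?P<node>\S+), "
    r"(?P<field>\w+) changed: (?P<old>.+?) -> (?P<new>.+)$"
)


def summarize_forced_restarts(lines):
    """Group forced-restart notices by (timestamp, resource, node).

    Returns a dict mapping each key to {field: (old_value, new_value)}.
    Lines that do not match the pattern are ignored.
    """
    summary = defaultdict(dict)
    for line in lines:
        match = FORCED_RESTART.match(line.strip())
        if match:
            key = (match["stamp"], match["rsc"], match["node"])
            summary[key][match["field"]] = (match["old"], match["new"])
    return dict(summary)


if __name__ == "__main__":
    prefix = ("Nov 19 13:44:02 elisa pengine: [1460]: notice: "
              "check_rsc_parameters: Forcing restart of www on eldon, ")
    sample = [
        prefix + "type changed: Filesystem -> <null>",
        prefix + "class changed: ocf -> <null>",
        prefix + "provider changed: heartbeat -> <null>",
    ]
    for (stamp, rsc, node), changes in summarize_forced_restarts(sample).items():
        print(stamp, rsc, node, changes)
```

Comparing the timestamps of successive groups would show whether the
restarts really line up with the 15-minute recheck interval.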

What might be causing this? I've included all of the relevant
information that I can think of below. If there's anything else I can
provide that would help, let me know. If it's an RTFM thing, I'd be
grateful if you could also point me towards the right FM to R.

node eldon \
        attributes standby="off"
node elisa \
        attributes standby="off"
primitive apache lsb:apache2
primitive drbd_mysql ocf:linbit:drbd \
        params drbd_resource="mysql" \
        op monitor interval="15s" \
        op start interval="0" timeout="240" \
        op stop interval="0" timeout="100"
primitive drbd_www ocf:linbit:drbd \
        params drbd_resource="www" \
        op monitor interval="15s" \
        op start interval="0" timeout="240" \
        op stop interval="0" timeout="100"
primitive fs_mysql ocf:heartbeat:Filesystem \
        params device="/dev/drbd/by-res/mysql" directory="/var/lib/mysql" fstype="ext4" options="noatime,nodev,nosuid,noexec" \
        op start interval="0" timeout="60" \
        op stop interval="0" timeout="60"
primitive fs_www ocf:heartbeat:Filesystem \
        params device="/dev/drbd/by-res/www" directory="/var/www" fstype="ext4" options="noatime,nodev,nosuid" \
        op start interval="0" timeout="60" \
        op stop interval="0" timeout="60"
primitive ip_mysql ocf:heartbeat:IPaddr2 \
        params ip="10.0.2.10"
primitive ip_www ocf:heartbeat:IPaddr2 \
        params ip="207.179.127.50"
primitive mysqld lsb:mysql
group mysql fs_mysql ip_mysql mysqld \
        meta target-role="Started" is-managed="true"
group www fs_www ip_www apache \
        meta target-role="Started" is-managed="true"
ms ms_drbd_mysql drbd_mysql \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Started"
ms ms_drbd_www drbd_www \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Started"
location loc_mysql mysql 200: elisa
location loc_www www 200: eldon
colocation mysql_on_drbd inf: mysql ms_drbd_mysql:Master
colocation www_on_drbd inf: www ms_drbd_www:Master
order mysql_after_drbd inf: ms_drbd_mysql:promote mysql:start
order www_after_drbd inf: ms_drbd_www:promote www:start
property $id="cib-bootstrap-options" \
        dc-version="1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        no-quorum-policy="ignore" \
        stonith-enabled="false"


crm(live)# status
============
Last updated: Sat Nov 19 13:34:25 2011
Stack: openais
Current DC: elisa - partition with quorum
Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b
2 Nodes configured, 2 expected votes
4 Resources configured.
============

Online: [ eldon elisa ]

 Resource Group: mysql
     fs_mysql	(ocf::heartbeat:Filesystem):	Started elisa
     ip_mysql	(ocf::heartbeat:IPaddr2):	Started elisa
     mysqld	(lsb:mysql):	Started elisa
 Master/Slave Set: ms_drbd_mysql
     Masters: [ elisa ]
     Slaves: [ eldon ]
 Master/Slave Set: ms_drbd_www
     Masters: [ eldon ]
     Slaves: [ elisa ]
 Resource Group: www
     fs_www	(ocf::heartbeat:Filesystem):	Started eldon
     ip_www	(ocf::heartbeat:IPaddr2):	Started eldon
     apache	(lsb:apache2):	Started eldon

Failed actions:
    drbd_mysql_monitor_0 (node=elisa, call=2, rc=6, status=complete): not configured
    drbd_mysql_monitor_0 (node=eldon, call=2, rc=6, status=complete): not configured
    fs_mysql_start_0 (node=eldon, call=8, rc=5, status=complete): not installed

I've also uploaded the syslogs of the restart event here (they're
rather large and I don't wish to spam the mailing list further than
necessary):

  eldon: http://pastebin.com/raw.php?i=p6Kmct9f
  elisa: http://pastebin.com/raw.php?i=mwddDxKi

Many thanks,
Charles



