[Pacemaker] Unexpected resource restarts when node comes online

David Vossel dvossel at redhat.com
Tue Aug 21 14:40:43 UTC 2012


----- Original Message -----
> From: "Gareth Davis" <Gareth.Davis at ipaccess.com>
> To: "The Pacemaker cluster resource manager" <pacemaker at oss.clusterlabs.org>
> Sent: Tuesday, August 21, 2012 9:01:39 AM
> Subject: [Pacemaker] Unexpected resource restarts when node comes online
> 
> Hi,
> 
> Quick bit of background: I've recently updated from pacemaker 1.0 to
> 1.1.5 because of an issue where cloned resources would be restarted
> unexpectedly when any of the nodes went into standby or failed
> (https://developerbugs.linuxfoundation.org/show_bug.cgi?id=2153).
> 1.1.5 certainly fixes this issue.
> 
> But now that I've got it all up and running, I've noticed that returning
> a node from standby to online triggers a restart of my application
> server.

I took a quick look at your config. My guess is that the following order constraint is causing the restart of NOSServiceManager0 when the node comes back online.

order order_NOSServiceManager0_after_NOSFileSystemCluster inf: NOSFileSystemCluster NOSServiceManager0

I'm thinking the interleave clone resource option might help with this.  http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/ch10s02s02.html
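
As a rough sketch of what that could look like, assuming NOSFileSystemCluster is a clone in your config (the primitive name NOSFileSystem below is a placeholder, not taken from your config), interleave is set as a meta attribute on the clone definition:

clone NOSFileSystemCluster NOSFileSystem meta interleave="true"

It should also be possible to set it from the crm shell without editing the whole configuration:

crm resource meta NOSFileSystemCluster set interleave true

Keep in mind that interleave only changes how ordering constraints between clones are handled, so it may need to be set on each clone in the ordering chain. I haven't tested whether it removes this particular restart.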

-- Vossel

> I'm afraid the config is complex, involving a couple of DRBD pairs,
> four clones, and a GlassFish application server (NOSServiceManager0).
> 
> Output of crm configure show.
> https://dl.dropbox.com/u/5427964/config.txt
> 
> 
> There are 2 nodes in the cluster (oamdev-vm11 & oamdev-vm12), and all
> the non-cloned resources are running on oamdev-vm12.
> 
> Putting oamdev-vm11 into standby causes nothing unexpected, but
> bringing it back online causes NOSServiceManager0 to be stopped and
> started.
> 
> crm_report output (the time span should include the standby and
> online events):
> https://dl.dropbox.com/u/5427964/pcmk-Tue-21-Aug-2012.tar.bz2
> 
> I'm at a bit of a loss as to how to debug this. I suspect I've messed
> up the ordering in some way. Any pointers?
> 
> Gareth Davis