[Pacemaker] Unexpected resource restarts when node comes online

Gareth Davis Gareth.Davis at ipaccess.com
Wed Aug 22 03:52:46 EDT 2012


I can't see this happening, but I'll have a go at playing with these values
tomorrow.

I'll let you know how it goes.

Thanks
Gareth

On 21/08/2012 16:56, "Jake Smith" <jsmith at argotec.com> wrote:

>
>----- Original Message -----
>> From: "Gareth Davis" <Gareth.Davis at ipaccess.com>
>> To: "The Pacemaker cluster resource manager"
>><pacemaker at oss.clusterlabs.org>
>> Sent: Tuesday, August 21, 2012 11:28:53 AM
>> Subject: Re: [Pacemaker] Unexpected resource restarts when node comes
>>online
>> 
>> From the documentation it seems that the default is actually
>> interleave=true... which is, I think, the desired setting, i.e. only
>> wait for the local instance rather than all the clones. I've tried with
>> interleave=true and false... it doesn't seem to be the cause of the
>> problem.
>> 
>> I'll continue with interleave="true" on all clones.
>> 
>> I've been playing around with ptest and I think the fs1_group is
>> being restarted, which in turn restarts NOSFileSystemCluster etc.
>> 
>
>I know it's pretty obvious, but the locations of your DRBD masters don't
>change between standby and online, do they?
>
>Was thinking of a score problem between stickiness and placement/advisory
>location maybe...
>
>Jake
>
>> Gareth
>> 
>> On 21/08/2012 15:40, "David Vossel" <dvossel at redhat.com> wrote:
>> 
>> >----- Original Message -----
>> >> From: "Gareth Davis" <Gareth.Davis at ipaccess.com>
>> >> To: "The Pacemaker cluster resource manager"
>> >><pacemaker at oss.clusterlabs.org>
>> >> Sent: Tuesday, August 21, 2012 9:01:39 AM
>> >> Subject: [Pacemaker] Unexpected resource restarts when node comes
>> >> online
>> >> 
>> >> Hi,
>> >> 
>> >> Quick bit of background: I've recently updated from Pacemaker 1.0
>> >> to 1.1.5 because of an issue where cloned resources would be
>> >> restarted unexpectedly when any of the nodes went into standby or
>> >> failed
>> >> (https://developerbugs.linuxfoundation.org/show_bug.cgi?id=2153).
>> >> 1.1.5 certainly fixes this issue.
>> >> 
>> >> But now that I've got it all up and running, I've noticed that
>> >> returning a node from standby to online triggers a restart of my
>> >> application server.
>> >
>> >I took a quick look at your config.  My guess is that the following
>> >order
>> >constraint is causing the restart of NOSServiceManager0 when the
>> >node
>> >comes back on.
>> >
>> >order order_NOSServiceManager0_after_NOSFileSystemCluster inf:
>> >NOSFileSystemCluster NOSServiceManager0
>> >
>> >I'm thinking the interleave clone resource option might help with
>> >this.
>> 
>> >http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/ch10s02s02.html
>> >
>> >-- Vossel
>> >
>> >> I'm afraid the config is complex, involving a couple of DRBD pairs,
>> >> four clones, and a GlassFish application server, NOSServiceManager0.
>> >> 
>> >> Output of crm configure show.
>> >> https://dl.dropbox.com/u/5427964/config.txt
>> >> 
>> >> 
>> >> There are 2 nodes in the cluster (oamdev-vm11 & oamdev-vm12); all
>> >> the non-cloned resources are running on oamdev-vm12.
>> >> 
>> >> Putting oamdev-vm11 into standby causes nothing unexpected, but
>> >> bringing it back online causes NOSServiceManager0 to be stopped
>> >> and started.
>> >> 
>> >> crm_report output, the time span should include the standby and
>> >> online
>> >> events.
>> >> https://dl.dropbox.com/u/5427964/pcmk-Tue-21-Aug-2012.tar.bz2
>> >> 
>> >> I'm at a bit of a loss as to how to debug this. I suspect I've
>> >> messed up the ordering in some way. Any pointers?
>> >> 
>> >> Gareth Davis
>> >> 
>> >> This message contains confidential information and may be
>> >> privileged.
>> >> If you are not the intended recipient, please notify the sender
>> >> and
>> >> delete the message immediately.
>> >> 
>> >> ip.access Ltd, registration number 3400157, Building 2020,
>> >> Cambourne Business Park, Cambourne, Cambridge CB23 6DW, United
>> >> Kingdom
>> >> 
>> >> _______________________________________________
>> >> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> >> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>> >> 
>> >> Project Home: http://www.clusterlabs.org
>> >> Getting started:
>> >> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> >> Bugs: http://bugs.clusterlabs.org
>> >> 
