[Pacemaker] corosync stop and consequences

Wed Jun 26 07:34:47 EDT 2013

On 26/06/2013, at 12:24 AM, Digimer <lists at alteeve.ca> wrote:

> On 06/25/2013 07:29 AM, andreas graeper wrote:
>> hi,
>> this may be the same question asked again and again, please excuse me.
>> 
>> two nodes (n1 active / n2 passive) and `service corosync stop` on the active node.
>> does the node that is going down tell the other node that it is leaving
>> before it actually disconnects?
>> so that there is no reason for n2 to kill n1?
>> 
>> on n2 after n1.corosync.stop :
>> 
>> drbd:promote OK
>> lvm:start OK
>> filesystem:start OK
>> but ipaddr2 is still stopped?
>> 
>> n1::drbd:demote works?! so i would expect that all the dependent
>> resources have been stopped successfully?!
>> and if not, why? why should ipaddr2:stop fail,
>> and if it did fail, could filesystem:stop, lvm:stop and drbd:demote
>> still succeed?
>> 
>> how can i find a hint in the logs as to why ipaddr fails to start?
>> 
>> thanks
>> andreas
> 
> If you stop corosync while pacemaker is running,

The only way you can do this is to kill it with SIGKILL.
We use a corosync API that prevents corosync from exiting cleanly until pacemaker is stopped.
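
If you script the shutdown yourself, a small guard makes the ordering explicit. This is only a sketch: it assumes pacemaker runs as a separate `pacemakerd` process (not as a corosync plugin) and that corosync is managed by an init script, as in the `service corosync stop` example above. The function names and the PGREP override are hypothetical, for illustration only.

```shell
# Sketch: refuse to stop corosync while pacemaker is still running.
# Assumption: pacemaker runs as a separate "pacemakerd" process.

pacemaker_running() {
    # PGREP is a hypothetical override so the check can be stubbed out.
    ${PGREP:-pgrep} -x pacemakerd >/dev/null 2>&1
}

stop_corosync_if_safe() {
    if pacemaker_running; then
        echo "pacemakerd is still running; stop pacemaker first" >&2
        return 1
    fi
    service corosync stop
}
```

Calling `stop_corosync_if_safe` on a node where pacemaker is still up prints the message and returns non-zero instead of attempting the stop.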

> it may well still get
> fenced (I've not tested this myself). If you want to gracefully shut
> down without a fence, migrate the services off of the node (if any were
> running), then stop pacemaker, then stop corosync and it should be fine.
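
The sequence described above could be scripted roughly as follows. This is a sketch, not a tested procedure: it assumes crmsh is installed (for `crm node standby`) and that both daemons have init scripts. The `run` and `graceful_stop` helpers and the DRY_RUN variable are made up for illustration.

```shell
# Sketch: take a node out of the cluster gracefully, in the order
# "move resources away, stop pacemaker, stop corosync".
# Assumptions: crmsh is available; daemons are managed via "service".
# Set DRY_RUN=1 to print the commands instead of running them.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

graceful_stop() {
    node="$1"
    # 1. Put the node in standby so resources move to the peer.
    run crm node standby "$node" || return 1
    # 2. Stop pacemaker so it leaves the cluster cleanly.
    run service pacemaker stop || return 1
    # 3. Only then stop corosync.
    run service corosync stop || return 1
}
```

Run it on the node that is leaving, e.g. `graceful_stop n1`. Putting a node in standby does not wait for the resources to finish moving, so it is worth checking `crm_mon -1` on the peer before relying on the result.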
> 
> -- 
> Digimer
> Papers and Projects: https://alteeve.ca/w/
> What if the cure for cancer is trapped in the mind of a person without
> access to education?
> 