[Pacemaker] How to perform a clean shutdown of Pacemaker in the event of network connection loss

Wed Jul 24 04:40:15 EDT 2013

I did not enable fencing. When the node is up, I see the following
processes running:

Corosync
--------------
/usr/sbin/corosync

Pacemaker
-----------------
/usr/libexec/pacemaker/lrmd
/usr/libexec/pacemaker/pengine
pacemakerd
/usr/libexec/pacemaker/stonith
/usr/libexec/pacemaker/cib
/usr/libexec/pacemaker/crmd

If I shut down the network connection of any node and then list the
processes, "/usr/sbin/corosync" is no longer running, and for Pacemaker
only the following processes are left:

Pacemaker
-----------------
/usr/libexec/pacemaker/lrmd
/usr/libexec/pacemaker/pengine

If there is no loss of network connectivity and I perform a clean shutdown,
none of the Corosync or Pacemaker processes listed above remain. I tried
killing the remaining processes after the network connection was lost, but
that does not prevent the fallen node from taking back the resource if it
held it before going down.
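
A minimal sketch of the clean shutdown order referred to above, under
stated assumptions: the thread does not say how the services are managed,
so the init script names "pacemaker" and "corosync", the use of the
"service" command, and the helper names are all hypothetical. Pacemaker is
stopped first because it runs on top of Corosync.

```python
import subprocess

# Assumed service names. Pacemaker is stopped before Corosync because it
# depends on the Corosync membership and messaging layer.
STOP_ORDER = ["pacemaker", "corosync"]


def stop_commands(order=STOP_ORDER):
    """Build the stop commands, in order, without running them."""
    return [["service", name, "stop"] for name in order]


def clean_shutdown(run=subprocess.check_call):
    """Run each stop command in turn.

    With the default runner, a non-zero exit status raises
    subprocess.CalledProcessError, so Corosync is not stopped if
    stopping Pacemaker failed.
    """
    for cmd in stop_commands():
        run(cmd)
```

Passing a different runner, for example clean_shutdown(run=print), shows
the commands without executing them. This only describes the normal clean
shutdown; it does not by itself address a node that lost its network
connection.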

Is there a way to perform a clean shutdown if Pacemaker was shut down
improperly?

On Wed, Jul 24, 2013 at 8:21 AM, Andrew Beekhof <andrew at beekhof.net> wrote:

>
> On 24/07/2013, at 9:54 AM, Tan Tai hock <taihock at gmail.com> wrote:
>
> > No I did not. It seems like corosync and pacemaker stop running when the
> > network connection is lost.
>
> Do you have fencing enabled?
> If not, I'd be surprised if corosync or pacemaker stopped running.
>
> > I am trying to simulate a scenario whereby a node which started the
> > resource loses its network connection, and to observe how it reacts upon
> > rejoining the cluster. Is there a proper way to shut down both corosync
> > and pacemaker in such a scenario?
>
> They are not supposed to stop running just because connectivity was lost.
>
> >
> > On Jul 24, 2013 6:55 AM, "Andrew Beekhof" <andrew at beekhof.net> wrote:
> >
> > On 23/07/2013, at 11:28 AM, Tan Tai hock <taihock at gmail.com> wrote:
> >
> > > Hi,
> > >
> > > I have currently set up 3 machines with Pacemaker 2.3 with Corosync
> > > 1.19. I have tested some scenarios and have encountered a problem which
> > > I hope to get some advice on.
> > >
> > > My scenario is as follows:
> > >
> > > The 3 machines, named A, B and C, are all running, with A being the node
> > > which started the resource as seen in crm_mon. If I cut off the network
> > > connection for A, B takes over as the node which started the resource.
> > > I then resume the network connection and start both corosync and
> > > pacemaker on A again
> >
> > Did you stop it there first?
> >
> > > and the node which started the resource now returns to node A.
> > > I have set stickiness and performed an identical test, but with a proper
> > > shutdown of pacemaker and corosync, and it works fine.
> > > Is there any way to perform a clean shutdown in the event that a node
> > > loses its network connection, so that it will not attempt to take back
> > > the resource it held before it was uncleanly shut down?
> > >
> > > Thanks
> > > _______________________________________________
> > > Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> > > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> > >
> > > Project Home: http://www.clusterlabs.org
> > > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > > Bugs: http://bugs.clusterlabs.org