[Pacemaker] best/proper way to shut down a node for service

Martin Seener martin.seener at barzahlen.de
Wed Jan 23 08:42:24 UTC 2013


Hi,

We have a 2-node active/standby PGSQL/DRBD cluster with STONITH. We first put one node in standby,
then shut down pacemaker on this standby node (service pacemaker stop), wait a few seconds, then do the same
with corosync (service corosync stop), wait a few seconds again, and keep an eye on crm_mon -r on the active node throughout.

After that, the standby node's status should be OFFLINE (standby). Then we can safely reboot or shut down this node.
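
As a rough sketch, the sequence above could be wrapped in a small shell function. The function name, the node name and the wait time are hypothetical placeholders, and the dry-run wrapper is an addition for safe testing, not part of our actual procedure. With DRY_RUN=1 (the default here) the commands are only printed.

```shell
# Print the command when DRY_RUN=1 (default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Usage: prepare_node_for_service <nodename> [wait-seconds]
# Intended to be run on the node that is going down for service.
# Meanwhile, watch "crm_mon -r" on the active node until this node
# is listed as OFFLINE (standby) before rebooting it.
prepare_node_for_service() {
    node="$1"
    wait_s="${2:-10}"                 # hypothetical default pause
    run crm node standby "$node"      # move resources away first
    run service pacemaker stop        # stop pacemaker before corosync
    run sleep "$wait_s"
    run service corosync stop
    run sleep "$wait_s"
}
```

Calling `prepare_node_for_service db2 5` in dry-run mode prints the five steps in order; setting DRY_RUN=0 would actually execute them.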

Once it has rebooted, we first start DRBD and let it sync completely, then start corosync (which autostarts pacemaker) with
service corosync start. After a few moments the node shows up as "standby" again in the cluster and you can
put it back online with crm node online <nodename>.
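
The way back could be sketched in the same style. Several things here are assumptions rather than our exact commands: that DRBD is started with "service drbd start", that the sync state is read from /proc/drbd (as on DRBD 8.x), and that a fully synced resource shows ds:UpToDate/UpToDate. The function names and wait time are hypothetical.

```shell
# Print the command when DRY_RUN=1 (default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Reads DRBD status text on stdin; succeeds only if at least one
# "ds:" field is present and all of them are UpToDate/UpToDate.
drbd_synced() {
    awk '
        {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^ds:/) {
                    seen = 1
                    if ($i != "ds:UpToDate/UpToDate") bad = 1
                }
            }
        }
        END { if (seen && !bad) exit 0; exit 1 }
    '
}

# Usage: rejoin_node <nodename> [wait-seconds]
rejoin_node() {
    node="$1"
    wait_s="${2:-10}"                      # hypothetical default pause
    run service drbd start                 # assumption: DRBD init script
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ wait until /proc/drbd reports UpToDate/UpToDate"
    else
        until drbd_synced < /proc/drbd; do sleep 5; done
    fi
    run service corosync start             # pacemaker is autostarted
    run sleep "$wait_s"                    # node reappears as standby
    run crm node online "$node"
}
```

The sync check is kept separate from the command sequence so it can be tried against sample status lines without touching a real cluster.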

This works very well and we no longer see crm hang on the active node, as we did when we failed to stop pacemaker and then corosync
before the reboot.

You can also set maintenance-mode=true for the whole cluster, but then PGSQL is not monitored even on the active node (it would not be restarted if it went down), so
we only use maintenance mode when we really perform manual steps on PG or update the cluster software.
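
For completeness, toggling maintenance mode with the crm shell might look like the sketch below. The helper name is hypothetical and the dry-run wrapper is only there so the command can be inspected before it is run for real.

```shell
# Print the command when DRY_RUN=1 (default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Usage: set_maintenance_mode true|false
# While maintenance-mode=true, the cluster stops monitoring and
# restarting resources, so switch it back to false when done.
set_maintenance_mode() {
    case "$1" in
        true|false)
            run crm configure property "maintenance-mode=$1"
            ;;
        *)
            echo "usage: set_maintenance_mode true|false" >&2
            return 2
            ;;
    esac
}
```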

Greetings from Berlin,

Martin


From: Dan Frincu <df.cluster at gmail.com>
Reply-To: The Pacemaker cluster resource manager <pacemaker at oss.clusterlabs.org>
Date: Wednesday, January 23, 2013 9:32 AM
To: The Pacemaker cluster resource manager <pacemaker at oss.clusterlabs.org>
Subject: Re: [Pacemaker] best/proper way to shut down a node for service

Hi,

On Wed, Jan 23, 2013 at 5:21 AM, Brian J. Murrell <brian at interlinx.bc.ca> wrote:
OK.  So you have a corosync cluster of nodes with pacemaker managing
resources on them, including (of course) STONITH.

What's the best/proper way to shut down a node, say, for maintenance
such that pacemaker doesn't go trying to "fix" that situation and
STONITHing it to try to bring it back up, etc.?

Currently my practice for STONITH is to have it reboot.  Maybe it's a
better practice to have STONITH configured to just power a node down and
not try to power it back up for this exact reason?

Any other suggestions welcome.

I usually put the node in standby, which means it can no longer run
any resources. Both Pacemaker and Corosync continue to run, and the node
still provides quorum.

For global cluster maintenance, such as when upgrading to a major
software version, maintenance-mode is needed.

HTH,
Dan


Cheers,
b.


_______________________________________________
Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org




--
Dan Frincu
CCNA, RHCE
