[Pacemaker] Resources not migrating on node failure?
Tim Serong
tserong at novell.com
Wed Dec 1 11:32:39 UTC 2010
On 12/1/2010 at 05:11 AM, Anton Altaparmakov <aia21 at cam.ac.uk> wrote:
> Hi,
>
> I have set up a three node cluster (running Ubuntu 10.04 LTS server with
> Corosync 1.2.0, Pacemaker 1.0.8, drbd 8.3.7), where one node is only present
> to provide quorum to the other two nodes in case one node fails but it itself
> cannot run any resources. The other two nodes are running drbd in
> master/slave to provide replicated storage, then XFS file system on top of
> the drbd storage on the master, together with an NFS server on top of the XFS
> mount, and a service IP address on which the NFS export is shared. This is
> all working brilliantly and I can cause the resources to move to the slave
> node by running "crm_standby -U cerberus -v on" where cerberus is the master
> node and everything then migrates to the slave node "minotaur".
>
> My problem comes when I pull the power plug on the master node "cerberus".
> Then nothing happens! minotaur continues to run as slave and never takes over.
>
> And I don't get why. )-:
Probably because STONITH is disabled. minotaur can't take over the resources
unless it knows they're stopped on cerberus, and after an unclean failure such
as a power loss, STONITH (fencing) is the only way to guarantee that.
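
As a rough sketch of what enabling fencing could look like in the crm shell,
assuming the nodes have IPMI-capable management boards (nothing in this thread
says they do): the resource names, address and credentials below are
placeholders, and a matching device would be needed for minotaur as well.

```
# Hypothetical fencing device for cerberus. The IPMI address and
# credentials are placeholders, not values from this cluster.
primitive st-cerberus stonith:external/ipmi \
    params hostname="cerberus" ipaddr="192.0.2.10" userid="admin" \
        passwd="secret" interface="lan" \
    op monitor interval="60s"
# Keep the device off the node it is meant to fence.
location l-st-cerberus st-cerberus -inf: cerberus
property stonith-enabled="true"
```

The location constraint is there so that cerberus is never responsible for
fencing itself.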
> Also, a second question, possibly related to the first problem: do I need
> to define monitor actions for each resource, or is that done automatically?
No, monitor ops aren't added automatically; you need to define them.
> If I need to do it specifically, how do I do that now that I have it all up
> and running without defining monitor actions?
Run "crm configure edit" and add whichever monitor ops you need.
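
For illustration, monitor ops on a setup like the one described might look
something like this inside the editor. The resource names, DRBD resource,
device, mount point, IP address and intervals are all hypothetical; only the
"op monitor" parts would be added to the existing primitives.

```
# Hypothetical names and values throughout; adapt to the real primitives.
primitive p_drbd ocf:linbit:drbd \
    params drbd_resource="r0" \
    op monitor interval="29s" role="Master" \
    op monitor interval="31s" role="Slave"
primitive p_fs ocf:heartbeat:Filesystem \
    params device="/dev/drbd0" directory="/srv/nfs" fstype="xfs" \
    op monitor interval="20s"
primitive p_ip ocf:heartbeat:IPaddr2 \
    params ip="192.0.2.20" cidr_netmask="24" \
    op monitor interval="10s"
```

The master/slave resource gets one monitor op per role, and the two need
different intervals so that Pacemaker can tell them apart.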
Have a look at Clusters from Scratch at:
http://www.clusterlabs.org/wiki/Documentation
HTH,
Tim
--
Tim Serong <tserong at novell.com>
Senior Clustering Engineer, OPS Engineering, Novell Inc.