[Pacemaker] How to failover when system is overloaded?

Thu Jun 5 00:04:53 UTC 2014

On 5 Jun 2014, at 5:58 am, Michael Monette <mmonette at 2keys.ca> wrote:

> Hi, 
> 
> Lately we have been having issues with our primary server becoming overloaded and basically unresponsive. I assumed that having a floating IP was enough, but it's not: the floating_ip resource does not fail over to the second system.
> 
> Could someone tell me how they deal with this problem? Is there some resource agent where node-2 checks on node-1 and, if there is no reply within X amount of time, takes over the floating IP?

Are corosync/heartbeat still functioning normally underneath?
Is fencing configured? What is the rest of your config? Logs?

You need to give us something to work with.

> 
> Pings seem to work fine, but SSH is dead and the web service is dead too. So maybe that's why the IP isn't failing over to node-2.
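
That symptom is worth spelling out: the kernel can keep answering pings (and even completing TCP handshakes) while user-space daemons such as sshd or the web server are starved and never reply. A check that only looks at reachability will therefore report the node as healthy. Below is a minimal sketch, not a Pacemaker resource agent, of a check that passes only if the application itself answers within a deadline. The function name `service_responds`, its parameters, and the choice of a plain HTTP GET are assumptions made for illustration; nothing here comes from the poster's configuration.

```python
import http.client


def service_responds(host, port, path="/", timeout=5.0):
    """Return True only if an HTTP response arrives within `timeout` seconds.

    Hypothetical helper: a successful ping or TCP connect is not enough,
    the web server process itself has to send back a reply.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        response.read()
        # Treat 5xx answers as "service unhealthy" (an assumption).
        return response.status < 500
    except (OSError, http.client.HTTPException):
        # Refused connection, timeout, reset, or malformed reply.
        return False
    finally:
        conn.close()
```

In a cluster, this kind of test would normally live in the monitor operation of the resource that manages the service, so that a failed monitor (rather than a failed ping) is what triggers recovery. Whether that is enough here depends on the answers to the questions above.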
> 
> Thanks for any help.
> 
> Mike
